Sir:

I, James V. Oberthaler, whose address is 7701 Woodmont Avenue, 
Apt. 406, Bethesda MD, 20814, declare as follows: 

1. I received a BS degree in Mathematics from the College of 
the Holy Cross. I also did graduate work in Mathematics at the 
University of Maryland and the American University, completing all 
requirements for an MA in Mathematics except the dissertation. I 
hold a Master's degree in Business Administration from Southern
Illinois University at Edwardsville, IL. My work experience includes 
over 40 years of software design and engineering and management of
these activities. My relevant experience is described briefly in the 
Curriculum Vitae that accompanies this letter. I currently serve as 
Principal Research Analyst, under contract to the National Cancer 
Institute Center for Bioinformatics.

2. I have read the patent specification for application 
Serial No. 09/866,925 as filed in the United States Patent & 
Trademark Office on May 30, 2001 for "ALGORITHMIC DETERMINATION OF 
FLANKING DNA SEQUENCES THAT CONTROL THE EXPRESSION OF SETS OF GENES 
IN PROKARYOTIC, ARCHEA AND EUKARYOTIC GENOMES." I read the amended
claims submitted October 30, 2002, and I have read the amended 
claims submitted with the amendment accompanying this Declaration. 

I have read the official communication from the U.S. Patent & 
Trademark Office dated January 8, 2003. I have considered all of 
the claims. 

3. I wish first to direct my comments to claims 20 - 27 
which, I have been advised, are the broadest claims in the 
application. I have been also advised that claims 28 - 37 are all 
dependent from claim 20 and, hence, include the limitations of claim 
20. 

Claims 20 - 27 are as follows: 

20. A method of identifying DNA sequences that control the 
expression of different collections of genes in a genome comprising 
detecting, by computer, one or more pairs of non-adjacent DNA 
sequences to which are bound one RNA molecule comprising of two RNA 
sequences.

21. A method of identifying DNA sequences that control the
expression of different collections of genes in a genome comprising 
detecting, by computer, changes in connectron behavior in the genome
as a function of changes in the sequence of the genome. 

22. A method of modifying, by computer, the expression of 
different gene collections in a genome, comprising detecting changes
in connectron behavior that results in changes in the level of 
connectron control sequences caused by an exogenous stimulus. 

23. A method of detecting, by computer, where and when new 
genes have been integrated into a host genome comprising detecting 
the operable link between the newly introduced gene and the existing 
connectron behavior in said host genome. 

24. A method of detecting, by computer, the expression effect 
of different gene collections in a given host genome, comprising 
detecting the transacting behavior of connectrons between the 
chromosomes thereof. 

25. A method of modifying a given genome comprising 
modifying, by computer, the connectron organization therein. 

26. A method of detecting, by computer, connectron control 
and target sequences in a given genome comprising: 

determining the base composition of said genome,

determining one or more sites of control sequence organization, 

and/or

determining one or more sites of target application. 

27. A method of determining, by computer, the response of a 
cell in any tissue to changes in the cell's environment and/or
genetic composition comprising providing a complete genomic DNA 
sequence for the organism and determining the effect of changes in 
connectrons due to application of a given exogenous stimulus to the 
genome.

4. It should be noted that all of the claims, including 
the dependent claims, are directed to a tetradic relationship 
between two specific adjacent RNA single-stranded sequences (called 
C1 and C2 for control sequence 1 and control sequence 2), which
interact with two distant double-stranded DNA sequences (called T1
and T2 for target sequence 1 and target sequence 2). Claim 20
refers to a computer algorithm for identifying DNA sequences that 
control the expression of different collections of genes in a genome 
by identifying one or more pairs of non-adjacent DNA sequences to 
which are bound one RNA molecule containing two RNA sequences. 

5. Referring to pages 5 and 6 of the Examiner's action and
in reference to subparagraphs a) through h), I wish to state the
following: 

In subparagraph a), the Examiner first states that in order to
practice the claimed invention, one skilled in the art must identify 
and use a connectron to predict the relation of gene expression. 
Keeping in mind that the claims under consideration are directed to 
computer mediated methods of analysis of connectron sequences, I 
disagree with the Examiner's conclusion for the following reasons: 

(1) Claim 20 asserts that this patent application 
provides a mechanism (i.e., a computer algorithm) whereby one 
skilled in the art (i.e., a journeyman, molecular biologist, 
bioinformatician, or computer programmer who understands the 
storage format, content and use of readily available 
bioinformatics resources) can write software, following the
algorithm, that will analyze the DNA sequence of an organism to
identify DNA sequences (called C1, C2, T1 and T2 in the
description of the algorithm) meeting specific criteria set
forth in the description. And, further, that the identified
sequences behave in such a way that when the control sequence
containing C1 and C2 is transcribed into RNA, the RNA will seek
out and bind to the target sequence (C1 binding to T1 and C2
binding to T2) to achieve the effect that the entire DNA
sequence beginning with T1 and ending with T2 is shielded from
transcription.

(2) The software, implemented following the algorithm 
and set to work on standard, readily available genome 
sequences, will identify the collection of DNA and RNA 
sequences making up connectrons as defined in the patent 
application without any need for understanding by or help from 
the computer programmer. 

Further, in subparagraph b), the Examiner states
that the description provides guidance to identify connectron 
symmetries in genomic sequences, and I agree. However, the 
Examiner also contends that the description does not provide 
detailed guidance to use identified connectron symmetries 
to predict an effect on gene expression, and with respect to
this contention I disagree for the same reasons stated in
paragraphs (1) and (2) above.

In subparagraph c), the Examiner contends that the description
provides working examples of identification of connectron symmetries 
in genomic sequences, and I agree. See pages 37 and 38 of the 
specification for a listing of the examples, and pages 39 - 188 for 
detailed expositions of sample uses of the algorithm. The Examiner 
further contends that the description does not provide working
examples of using identified connectron symmetries to predict 
effects on gene expression. 

I disagree. On the contrary, this is exactly what the examples 
provide. As explained in the introduction and in the definitions
provided (particularly, the definitions of Possible Connectron and
Hierarchy of Connectron Action), each connectron control sequence C1-C2
will, when transcribed into RNA, seek out and bind to its target
sequence T1-T2, thereby shielding the DNA between T1 and T2 from
transcription. Since the shielded DNA sequence will not be
transcribed, any genes in the span between T1 and T2 will not be
expressed as proteins for as long as the C1-C2 sequence remains
bound to T1-T2. Similarly, any additional C1-C2 sequences in the
span between T1 and T2 will also remain inactive during this time
period, and therefore the inhibitory effect they otherwise would 
have exerted on their target sequences will be suppressed during 
this time period. Granted that the full, cascading sequence of 
transcription/expression and sequestration that would result from 
each of the examples discussed is not presented, the principles are 
given that would enable anyone who understands the mechanism, as 
explained in the application, to follow the effects as deeply as he 
or she desires. 

In subparagraph d), the Examiner states that the nature of the
invention, gene expression control, is complex. I agree for the 
reasons stated in the preceding paragraph and for the even more 
fundamental reason that the molecular-biological processes of even 
the simplest cell are very complex. Life is very complex: a fully 
formed organism with incredibly complicated biological activities 
develops from a single cell and lives a full lifetime by interacting 
in countless ways with its external environment. It would often be 
impossible to enumerate to completion all the effects that even a 
single connectron turning on or off would cause. Some cells would 
cycle through millions of possible states before repeating one. The 
only reasonable presentation of these effects is to give the 
principles that apply, and this has been done clearly and completely 
in the application. 

In subparagraph e), the Examiner asserts that the prior art
does not show connectrons; and for the purposes of this Declaration, 
I am assuming that the connectrons have the definition given above. 
I agree with the Examiner's contention. Mattick does not show 
connectrons as defined in the instant specification. It is my 
understanding and belief that the connectron invention disclosed in 
the present application was made by the inventor, Richard J.
Feldmann. 



In subparagraph f), the Examiner contends that the skill of the
art of gene expression is high, and I agree. 

In subparagraph g), the Examiner contends that the
predictability of the relationship of connectron symmetries and gene 
expression is unknown in the prior art, and I agree. Although I do 
not present myself as an expert in genetics or molecular biology, it 
is clear from the nature of the publications appearing at the time 
this letter is written that laboratory researchers are only now 
beginning to encounter, case by case, the effects disclosed by Mr.
Feldmann's fully-formed connectron invention. To the best of my
knowledge, these investigators are discovering individual 
applications of the invention, but no one except Mr. Feldmann has 
yet disclosed the overarching theory and its implications. 

In subparagraph h), the Examiner contends that the claims are
broad in that they are drawn to identification of connectron 
symmetries whose relationship to gene expression is not established. 
While I am not cognizant of the legal terms or definitions for the 
breadth of claims, my understanding of the breadth of method claim
20, for example, is that it requires detecting a DNA sequence that
controls the expression of different collections of genes in a
genome comprising detecting, by computer, one or more pairs of
non-adjacent DNA sequences to which is bound one RNA molecule comprising
two RNA sequences.

Claim 21 is directed to a method of identifying DNA sequences 
that control the expression of different collections of genes in a 
genome comprising detecting, by computer, changes in connectron 
behavior in the genome as a function of changes in the sequence of 
the genome.

Claim 25 recites the method of changing a given genome 
comprising modifying, by computer, the connectron organization 
therein. 

Claim 26 is directed to: 

a method of detecting, by computer, connectron 
control and target sequences in a given genome 
comprising: 

determining the base composition of said 
genome,

determining one or more sites of control 
sequence organization and/or 

determining one or more sites of target 
application. 

I agree that the skilled practitioner would turn to the 
description and drawings provided in the application for guidance in 
using the claimed invention. The specification provides a detailed 
roadmap for practicing the invention by one skilled in the art. 
Referring specifically to the specification and drawings, the 
introduction at pages 1-3 provides a basic description of 
connectron structure. Figures 1-3 are taken from the text by 
Alberts et al. entitled "The Molecular Biology of the Cell." Pages 
3-25. Pages 26 - 36 provide a detailed description of a
connectron structure. Page 31, the detailed description of the 
invention, provides a descriptive analysis of the flow diagrams 
utilized in the computer analysis of connectrons in any given 
genome.

Ten samples of connectrons found by computer mediation are set 
out in the specification. Pages 39 - 56 give an example of a 
prokaryote connectron - E. coli. I have considered this example as
well as all examples given against the backdrop of the Examiner's
contention that the description lacks clear evidence of the
connectron symmetries as related to gene expression, and it is my
opinion that the skilled practitioner would not have any difficulty
in practicing the invention from these descriptions for the
following reasons:

(1) The flowcharting conventions used are typical of those 
used to present computer algorithms. Together, they provide all the 
detail required for a complete implementation. 

(2) A wide variety of computer languages could be used to 
implement the algorithm. Any procedural third generation language 
could be used. 

(3) These skills are well within the competence of even 
journeyman programmers using languages such as Fortran, Cobol, PL-I, 
ALGOL, Pascal, etc., as well as more modern languages such as C,
C++, etc. 

(4) Computers with the necessary performance and capacity are 
readily available for an amount that is well within the reach of 
many home budgets, let alone the resources available to corporations
and research institutions. 



Finally, I disagree with the Examiner's
contention that the trial and error experimentation required to
practice the invention amounts to undue experimentation for the
following reasons:

(1) As stated earlier, the algorithms presented are 
straightforward and complete. 

(2) No experimentation whatsoever is required. Implementing 
the algorithms is a routine exercise in program design, coding and 
debugging. Running them is simply a matter of obtaining the 
organism-specific genomes and allowing the computer programs to go 
to work on them. 

(3) The only part of the activity that could conceivably be 
referred to as "experimenting" is the investigation into available 
bioinformatics resources, such as the syntax and semantics of the 
resources provided by, for example, the National Library of
Medicine's National Center for Biotechnology Information (NCBI). It
is clear that in this context, having a ready understanding of this
information is a reasonable characteristic of one who could be
called "skilled in the art." 

I hereby declare that all statements made herein of my own 
knowledge are true and that all statements made on information and 
belief are believed to be true; and further that these statements 
were made with the knowledge that willful false statements and the 
like so made are punishable by fine or imprisonment, or both, under 
Section 1001 of Title 18 of the United States Code and that such 
willful false statements may jeopardize the validity of the 
application or any patent issued thereon. 



Date: 



Name: 

Summary of Experience



Hands-on manager, skilled in oral and written communications, analysis, problem solving, and 
strategic planning 

Member of Federal Senior Executive Service with significant experience in managing a wide range 
of development and operational/support activities, developing and integrating systems in a variety 
of hardware and software platforms 

Information technology policy making at senior levels in the Federal Government, responsibility for 
oversight and review of major information systems. 

Led major efforts in systems quality assurance, configuration management, information systems 
security, strategic planning, and systems operation. 

Strong background in systems design and development in a broad variety of architectures and 
programming languages. 



Professional Experience 



Independent Consultant 2000 - present



Provides systems engineering and technical assistance to the National Cancer Institute's Center for 
Bioinformatics in developing enterprise-wide scientific information systems. The systems include 

■ The Enterprise Vocabulary System (NCI-EVS), an on-line thesaurus that provides a common 
vocabulary for scientific and administrative personnel throughout the Institute; and, 

■ The Intramural Research Directory, a comprehensive system containing information describing 
research investigations conducted by NCI employees. 

Reviews work done by other contractors; conducts analyses leading to improved business processes, 
coordinates implementation, ongoing operations, EVS project governance activities, etc. 

Project Leader, Synectics for Management Decisions, Inc. 1994 - 2000 

Managed projects at the Bureau of the Census, Department of Commerce and the Substance Abuse and
Mental Health Services Administration, DHHS.



U.S. Patent and Trademark Office (PTO) 1988 - 1994

Director for Systems Engineering and Evaluation (1988-1991)

Monitored the work of the Automated Patent System (APS) Integration Contractor in designing
and building APS components; improved the contractor's systems for quality assurance,
configuration management, component error-handling, and reporting. Conceptualized and developed a
comprehensive requirements document and detailed implementation plans for PTOnet, PTO's
campus area network.

Director of Central Computer Operations (1991-1994)

Led an office of 130 government employees and numerous contract personnel in operating,
maintaining, enhancing, and supporting the production information systems and basic computer
systems infrastructure of the PTO.



Deputy Assistant Secretary for Information Resources Management,
and Deputy Assistant Secretary for Management Analysis and Systems, DHHS 1987 - 1988



Advised the DHHS Secretary and the Assistant Secretary for Management and Budget (ASMB) on issues 
and policies pertaining to the use of information resources. 

Initiated projects of strategic importance: Department-wide electronic linkage for exchanging messages and
files; a backbone campus-area network for DHHS headquarters and adjoining buildings; a standard Local
Area Network for elements of the Office of the Secretary; an infrastructure of intelligent workstations and
support for ASMB, including DHHS regional operations.

Division of Computer Research and Technology (DCRT) 

National Institutes of Health, DHHS 1969 - 1987 

Assistant Director, DCRT (1983—1987) 

Assistant to Chief, NIH Computer Center (1977—1983). 

Head, Systems Team, NIH Computer Center (1972— 1977) 

Directed systems development programming for the computer center. 

Systems Programmer (1969 — 1972). 

Participated in designing and writing the NIH Shared Spool System, a major modification to the 
IBM Operating System. This system was used by more than 100 installations worldwide and 
served as the model for IBM's Multi-Access Spool, a standard facility of IBM Operating Systems.
Designed and wrote many enhancements to the WYLBUR text editing system. Developed
DATASTOR, supporting peer-to-peer communications among computers.

Staff Programmer, IBM Federal Systems Division 1968 - 1969 

Led the programmers who developed display software for the FAA's Air Traffic Control System. 

Health Services Officer, USPHS, CCB, DCRT, NIH 1966 - 1968

Diagnosed system problems and effected repair or circumvention; developed modifications to the IBM
operating system and did utility programming.

Senior Associate Programmer, IBM Federal Systems Division 1962 - 1966 

Performed system and utility programming for Air Force Project 473L; designed and wrote the part of the 
system control program that allowed application programs to allocate and access disk storage. 

Technical Summary 

IBM computers including 3090/200, large IBM computer centers; Unisys A-15, A-16; Amdahl 5990-1100;
IBM and Macintosh Personal Computers. OS/MVS, TSO, WYLBUR, NIH Shared Spool.

Local Area Networks (PTOnet and the NIH Campus Network), PL-1, COBOL, IBM S/360 Assembler
Language, variety of software on Apple Macintosh and IBM PC: Excel, Word, WordPerfect, PowerPoint,
Visio, Mac System 9 and X, MacProject, Microsoft Project, Adobe Photoshop.

Education 

BS Mathematics, Cum Laude, Holy Cross College, 1962. Graduate work in Mathematics, The American
University, 1969-1970. MBA, Southern Illinois University at Edwardsville, 1978. 



ALGORITHMIC DETERMINATION OF FLANKING DNA 
SEQUENCES THAT CONTROL THE EXPRESSION OF SETS 
OF GENES IN PROKARYOTIC, ARCHEA AND EUKARYOTIC 
GENOMES 


Reference to Related Application 

The present application is the subject of Provisional
Application Serial No. 60/208,650 filed June 2, 2000
entitled ALGORITHMIC DETERMINATION OF CONNECTRONS FOR THE
HIGH LEVEL REGULATION OF GENE EXPRESSION.

Introduction 

RNA introduced into a cell by a virus is now known to
trigger a cellular defense mechanism known as post-transcriptional
gene silencing (PTGS). If the viral RNA
sequence matches a sequence within the cell's genome, the
associated genes are turned off or silenced. This
phenomenon is also called 'RNA interference' or RNAi. A
single-stranded RNA can interact with another single-stranded
RNA (known as antisense RNA). The single-stranded
RNA can also form a triple-stranded complex with
double-stranded DNA. This triple-stranded complex is
known as a Hoogsteen helix. This patent application
shows how two specific adjacent RNA single-stranded
sequences (called C1 and C2 - for Control Sequence 1 and
Control Sequence 2) interact with two distant double-stranded
DNA sequences (called T1 and T2 - for Target
Sequence 1 and Target Sequence 2) to form a tetradic
relationship which is called a "connectron". The two
distant DNA double-stranded sequences (T1 and T2) must be
on the same chromosome in a genome and they must be
between about 1kb and 105kb of each other. The adjacent
single-stranded RNA sequences (C1/C2) can be on the same
chromosome as the T1 and T2 sequences or on a different one. The
C1 sequence is identical to the T1 sequence and the C2
sequence is identical to the T2 sequence. The connectron
acts to stabilize the double-stranded DNA by allowing
30nm chromatin particles to form. Genes that lie between
the T1 and T2 sequences when wrapped up in 30nm chromatin
particles are not open to promotion and expression. The
connectron (i.e. the tetradic relationship between the
T1-T2 sequences and C1/C2 sequences) provides a general
explanation for PTGS. A connectron can be implemented by
RNA sequences, PNA (Peptide Nucleic Acid) sequences or by
a zinc-finger DNA Binding Protein (DBP) specific to the
T1 and T2 sequences.

Characteristically the adjacent C1/C2 sequences lie in
the 3'UTR of a gene. The T1 and T2 sequences do not lie
within the translated region of any gene. These
sequences "surround" one or more genes. There are,
however, T1 and T2 sequence pairs that surround one or
more C1/C2 sequences that are not 3'UTR to any gene.
These are called "geneless connectrons". There may be
promoter sequences that cause the transcription of these
3'UTR sequences.

A computer-based algorithm that is similar to the
algorithm used in the US Patent 6,205,404 has been
developed to determine the connectron structure of any
genome. This algorithm determines the existence of all
the connectrons in the genomic DNA. Connectrons exist in
prokaryotes, archea, single-celled eukaryotes, multi-celled
eukaryotes, plants and higher animals. Connectron
relationships exist between prokaryotes and their
plasmids. The geneless connectrons provide a possible
mechanism for forming a hierarchy of gene expression
control that will produce an understanding of cell
differentiation and tissue development.

Each connectron is a unique tetrad of sequences. Each
connectron changes the expression of the genes between
the T1 and T2 sequences. The C1 sequence (which is
equivalent to the T1 sequence) and the C2 sequence (which
is equivalent to the T2 sequence) are determined by the
invention described in this patent application. In
general, the tetrad of connectron sequences can be
patented because the structure of matter is known and the
function of specific gene expression modulation is also
known. Gene expression modification can be produced by
introducing antisense RNA or PNA to interact with the C1/C2 RNA
sequences or zinc-finger DBPs to interact with the T1 and
T2 sequences. Using connectrons it will be possible to
modify cellular and tissue behavior in a very general
manner.

Examples will be given from different genomes to
illustrate that the connectron is a perfectly general and
universal concept.

Definitions 

Double stranded DNA - Watson and Crick showed in 1953
that DNA naturally forms a double-stranded helix. A
typical double stranded sequence is

5'-TAGAGGAGTACCAC-3'
3'-ATCTCCTCATGGTG-5'

Hydrogen Bond - The force between a hydrogen atom and
another heavier atom such as Oxygen (O), Nitrogen (N),
Phosphorus (P), or Sulfur (S).

Positive strand - The positive strand is normally
represented 5' to 3' running left to right as in

5'-TAGAGGAGTACCAC-3'

Negative strand - The negative strand is normally
represented 5' to 3' running right to left as in

3'-ATCTCCTCATGGTG-5'

Single stranded RNA - Either the positive or the negative
strand of the double-stranded DNA can be transcribed by
the polymerase. In RNA U replaces T.

RNA of positive strand sequence 5'-UAGAGGAGUACCAC-3'
RNA of negative strand sequence 5'-GUGGUACUCCUCUA-3'

Antisense RNA - The antisense strand of any RNA sequence
is the complement sequence

RNA sequence           5'-UAGAGGAGUACCAC-3'
Antisense RNA sequence 3'-AUCUCCUCAUGGUG-5'
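
The strand and RNA definitions above can be checked mechanically. The following is a minimal sketch in Python, not part of the specification; the function names are illustrative assumptions, and only the example sequence is taken from the text.

```python
# Illustrative helpers only; none of these names come from the specification.
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")
RNA_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(dna):
    """Return the opposite DNA strand, written 5' to 3'."""
    return dna.translate(DNA_COMPLEMENT)[::-1]


def to_rna(dna):
    """Write a DNA sequence as RNA of the same sequence (U replaces T)."""
    return dna.replace("T", "U")


def antisense_3_to_5(rna):
    """Return the base-by-base complement of an RNA, written 3' to 5'."""
    return rna.translate(RNA_COMPLEMENT)


positive = "TAGAGGAGTACCAC"                    # positive strand, 5' to 3'
negative_5_to_3 = reverse_complement(positive)  # negative strand, 5' to 3'
rna_positive = to_rna(positive)
rna_negative = to_rna(negative_5_to_3)
antisense = antisense_3_to_5(rna_positive)
```

With the example sequence, `rna_positive`, `rna_negative` and `antisense` reproduce the three RNA sequences listed in the definitions above.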

Triple Strand Helix - The RNA sequence of an RNA/DNA
triple-strand complex is the same as the positive strand
of the DNA

DNA positive strand 5'-TAGAGGAGTACCAC-3'
DNA negative strand 3'-ATCTCCTCATGGTG-5'
RNA strand          5'-UAGAGGAGUACCAC-3'



Promoter - Any region of DNA that binds proteins which
engage the polymerase transcription mechanism.

TATA Box - A region near the 3' end of a promoter with 
the sequence TATA. 

mRNA - The RNA produced from the DNA by the polymerase as 
a result of transcription 

Start of transcription - The 3' end of a promoter where 
the polymerase mechanism begins to transcribe DNA into 
mRNA. 

Exon - Any region of mRNA which is used to code for 
proteins 

Intron - Any region of mRNA lying between two exons which 
is not used to code for proteins. The introns are edited 
out of the initial RNA transcript to form the mature 
mRNA. 

3' UTR - The untranslated 3' end of an mRNA is beyond the
end of the last exon. A stop codon in the mRNA causes
the ribosome to stop the translation of mRNA into
protein.

End of translation - The 3' end of the 3'-most exon.

Translated region - Any collection of exons and introns.



Gene - Any DNA region that codes for a protein. Introns
do not occur in prokaryotic genes and they sometimes fail
to occur in eukaryotic genes. A typical model of a gene
is

|<---- promoter ---->|
        |<-TATA Box->|
                     |<-Beginning of Translation
                     |<------------- Translated Region ------------->|
                                                 End of Translation->|
                     |<-Exon->|<-Intron->|<-Exon->|<-Intron->|<-Exon->|<-3' UTR->|
+ strand
- strand
|<--------------------------------- Gene --------------------------------->|

Positive strand gene - Any gene in which the features run
5' to 3' on the positive strand

Negative strand gene - Any gene in which the features run
5' to 3' on the negative strand

C1 sequence - Any positive or negative strand DNA
sequence of 20 bases or more.
The C2 sequence must occur in the same chromosome as the
C1 sequence.

C2 sequence - Any positive or negative strand DNA
sequence of 20 bases or more.
The C1 sequence must occur in the same chromosome as the
C2 sequence.

C1/C2 - Any positive or negative strand DNA sequence of
40 or more bases such that the C1 sequence is adjacent to
the C2 sequence

T1 sequence - Any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence. The T1 and T2 sequences
must be between about 1kb and 105kb apart.

T2 sequence - Any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence. The T2 and T1 sequences
must be between about 1kb and 105kb apart.

Last exon gap or Gap-Distance - The number of bases
between the end of transcription and the beginning of the
C1/C2 sequence. In prokaryotes and single-celled
eukaryotes this gap can range from no bases to 500 bases.
In multi-celled eukaryotes the gap can be as large as
10,000 bases.

Poly-adenylation signal - A number of Adenosine (A) bases
are added to the mRNA at the end of the 3'UTR.

Possible Connectron - Any set of T1, T2 and C1/C2
sequences such that the C1 sequence is identical to the
T1 sequence and the C2 sequence is identical to the T2
sequence. The promoter of some gene causes the mRNA of
the gene to be expressed. The mRNA is edited to
eliminate the introns. The whole mRNA including the
3'UTR can move about in the cell or the nucleus of the
cell. The C1/C2 RNA that is part of the 3'UTR moves to
the T1 and T2 DNA sequences. A triple-stranded complex
of the DNA and the RNA forms such that the C1 sequence
forms hydrogen bonds with the T1 sequence and the C2
sequence forms hydrogen bonds with the T2 sequence.
Because the C1 sequence is adjacent to the C2 sequence,
the T1 sequence is brought physically close to the T2
sequence. This produces a loop of between about 1kb and
105kb in the DNA. Histone proteins reduce the length of
the DNA by binding 200 bases. Histone/DNA complexes form
six-fold symmetry chromatin assemblies. The diameter of
the chromatin assemblies is approximately 30nm.

Real Connectron - Any Possible Connectron which is within
the Gap-Distance of some gene.
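
The Gap-Distance test that separates a Real Connectron from a Possible Connectron can be sketched as follows. This is an illustration under stated assumptions: genes are taken to lie on the positive strand, `gene_end` is assumed to be the coordinate of the first base after the end of transcription, and the organism labels and function names are hypothetical. Only the 500-base and 10,000-base limits come from the definitions above.

```python
# Upper limits on the last exon gap, taken from the Gap-Distance definition.
MAX_GAP = {
    "prokaryote": 500,
    "single_celled_eukaryote": 500,
    "multi_celled_eukaryote": 10_000,
}


def gap_distance(gene_end, c_start):
    """Bases between the end of transcription and the start of C1/C2."""
    return c_start - gene_end


def is_real_connectron(c_start, gene_ends, organism):
    """True if C1/C2 starts within the allowed gap after some gene."""
    limit = MAX_GAP[organism]
    return any(0 <= gap_distance(end, c_start) <= limit for end in gene_ends)


gene_ends = [1_200, 8_000]
near = is_real_connectron(1_450, gene_ends, "prokaryote")           # gap of 250
far = is_real_connectron(5_000, gene_ends, "prokaryote")            # gap of 3800
far_multi = is_real_connectron(5_000, gene_ends, "multi_celled_eukaryote")
```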

Homologous connectron - The T1 sequence and the T2
sequence are on the same chromosome as the C1/C2 sequence

Heterologous connectron - The T1 sequence and the T2
sequence are on a chromosome different from the chromosome of
the C1/C2 sequence

Permanent connectron - Any C1/C2 sequence, which is 3'
UTR to some gene that is not surrounded by any T1 and T2
sequence pairs

Transient connectron - Any C1/C2 sequence, which is 3'
UTR to some gene that is surrounded by one or more T1 and
T2 sequence pairs

Self-limiting connectron - Any C1/C2 sequence which is
3'UTR to some gene that is surrounded by the T1 and T2
sequences such that C1=T1 and C2=T2

Geneless connectron - Any C1/C2 sequence which is not
3'UTR to some gene but is surrounded by some T1 and T2.
A promoter may lie 5' to the C1/C2 sequence.



Bidirectionality of Connectron Excitation - A C1/C2 short
loop on one strand selects a T1-T2 long loop pair on the
same or the opposite strand. The C1/C2 short loop has a
complementary C1'/C2' sequence on the opposite strand.
Similarly the T1-T2 long loop pair has a complementary
long loop pair T1'-T2'. Wherever a C1/C2, T1-T2 tetrad
exists there is a complementary C1'/C2', T1'-T2' tetrad.
The C1/C2 short loop can be transcribed as a 3' UTR to a
gene on the same strand. The C1'/C2' short loop which is
on the strand opposite to the C1/C2 short loop can also
be transcribed as a 3' UTR to a gene on the same
strand. There are four possible models of action:

   Model 1   + strand:  T1  T2  gene - C1/C2
             - strand:  (nothing shown)

   Model 2   + strand:  T1  T2
             - strand:  C2/C1 - gene

   Model 3   + strand:  (nothing shown)
             - strand:  T2'  T1'  C2'/C1' - gene

   Model 4   + strand:  gene - C1'/C2'
             - strand:  T2'  T1'

Of course, the short loops and the long loops do not have
to be on the same chromosome.
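
The complementary tetrad described under Bidirectionality can be
illustrated with a short Python sketch. It assumes sequences are written
5' to 3' over the alphabet A, C, G, T; the function names are
hypothetical and are not part of the specification.

```python
# Illustrative sketch only; the names are hypothetical.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the opposite-strand sequence, read 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]

def complementary_short_loop(c1: str, c2: str) -> tuple[str, str]:
    """Return (C2', C1'), in the order they are read on the opposite strand."""
    return reverse_complement(c2), reverse_complement(c1)
```

Read 5' to 3' on the opposite strand, the complementary short loop
appears in the order C2'/C1', which is the order shown for the - strand
in the models of action.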



Hierarchy of connectron action - When a C1/C2 is
expressed it forms a T1-T2 loop by forming a connectron.
The C1/C2 sequence does not have to be on the same
chromosome as the T1 and T2 sequences. This provides a
way of causing interaction between chromosomes. When the
T1-T2 loop forms, any genes in that loop region which had
been expressing C1/C2 sequences in their 3' UTRs now
cease expressing the C1/C2 sequences. The connectrons
formed by these C1/C2 sequences will cease to exist after
some time, thus opening up the genes inside the respective
T1-T2 loops to expression. The hierarchy of connectron
action alternates between repression and expression.
The connectron hierarchies can be of any depth.

One-to-Many connectron action - One C1/C2 sequence can
form connectrons in many different places on many
different chromosomes. The only requirement is that
C1=T1 and C2=T2. This makes it possible for one
expression event to control the expression of many genes
on different chromosomes.


Many-to-One connectron action - C1/C2s that come from
many different places on many different chromosomes can
form a connectron for a specific T1-T2 sequence pair.
The only requirement is that C1=T1 and C2=T2. This makes
it possible for many different expression events to
control the expression of one set of genes on a
particular chromosome.

Many-to-Many connectron action - The arrangement of
C1/C2s and T1-T2s across chromosomes can form a complex
web of gene expression control relationships.
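
The one-to-many, many-to-one and many-to-many relationships can be
pictured as a two-way lookup. The following Python sketch is only an
illustration: it assumes each connectron has already been reduced to a
pair of identifiers (one for the C1/C2 short loop, one for the T1-T2
long loop), and the function name and identifiers are hypothetical.

```python
from collections import defaultdict

def control_web(connectrons):
    """Build both directions of the control web.

    connectrons: iterable of (short_loop_id, long_loop_id) pairs,
    one pair per connectron (identifiers are assumed, not from the source).
    """
    targets_of = defaultdict(set)  # short loop -> long loops it can form
    sources_of = defaultdict(set)  # long loop -> short loops that can form it
    for short_loop, long_loop in connectrons:
        targets_of[short_loop].add(long_loop)
        sources_of[long_loop].add(short_loop)
    return dict(targets_of), dict(sources_of)
```

A short loop with several entries in the first mapping is a one-to-many
case; a long loop with several entries in the second mapping is a
many-to-one case.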

Percentage of the Genome Regulated by Connectrons - Since
the connectrons for a sequenced genome can be calculated,
the percentage of the genome that is open to connectron
regulation can be known.

Emergent Property - The network of connectrons in any
genome emerges from a knowledge of the complete DNA
sequence of the genome. Because both the C1/C2 sequences
and the T1-T2 sequences can be any place in the genome,
the whole genomic sequence must be known before all the
connectrons can be determined.
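
Once the T1-T2 long loops have been computed, the percentage of the
genome open to connectron regulation can be estimated. The Python sketch
below is one possible reading, not the method of the specification: it
assumes a single chromosome, represents each long loop as a half-open
(start, end) interval in bases, and measures the fraction of bases
covered by at least one loop. The function name is hypothetical.

```python
def regulated_percentage(loops, genome_length):
    """Percentage of bases covered by at least one T1-T2 long loop.

    loops: iterable of (start, end) half-open intervals on one chromosome.
    Overlapping loops are merged so that no base is counted twice.
    """
    covered = 0
    current_start = current_end = None
    for start, end in sorted(loops):
        if current_end is None or start > current_end:
            # Close the previous merged interval and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or touching interval: extend the current one.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return 100.0 * covered / genome_length
```

Counting genes rather than bases (as in the "percentage of genes
controlled" column of Table 1) would need gene coordinates as an
additional input.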

Paradigm Shift - For the past fifty years since the 
discovery by Watson and Crick of the double-helical 
nature of DNA, the reigning paradigm for scientific 
discovery has been the study of one gene and its effects 
on the behavior of a cell. The advent of genomic 
sequencing and this invention of connectrons that emerge 
from the whole genome will produce a shift in the way 
scientists view biological systems and the way they 
formulate and execute experiments. The many-to-many
relationships between the connectrons mean that there
are many ways in which the expression of a set of genes
can be modulated. The multiplicity of control pathways
produces a system stability that makes it possible
for biological systems to be stable for long periods of
evolutionary time. The thinking that goes into
formulating scientific experiments will have to change to
accommodate the changes in understanding that will be 
induced by the application and extension of this patent 
application. 

Hierarchy of DNA Structuring - The DNA of a cell's genome
is structured in a hierarchy of six levels. Figures 1, 2
and 3 have been adapted from The Molecular Biology of the
Cell by Alberts, Bray, Lewis, Raff, Roberts and Watson
[third edition, pages 354, 345 and 348]. As shown in
figure 1, the double-stranded DNA is level 1. The
double-stranded DNA is wrapped around histone proteins to
form a chromatin particle that is level 2 of the
hierarchy. Level 2 is described as "beads-on-a-string"
in figure 1. The chromatin particles are packed in a
six-fold symmetry as shown in figure 2a and figure 2b.
These six-fold assemblies, which constitute level 3, have
a diameter of 30 nm. Each 30 nm assembly contains from 18
(i.e. 6 * 3) to 30 (i.e. 6 * 5) chromatin particles. The
30 nm assemblies aggregate into large loops which range in
length from 5,000 bases to 100,000 bases of DNA. The size
of these large loops as shown in figure 1 is approximately
300 nm. These large loops constitute level 4 of the
structuring hierarchy. As shown in figure 1, at level 5 of
the DNA structuring hierarchy many large loops are
condensed to form a structure which is approximately
700 nm in diameter. The complete chromosome that
constitutes level 6 of the hierarchy is composed of two
very long sections of level 5 DNA.

Model of Chromatin Structure - The level 4 structure of
DNA as shown in figure 1 ranges in length from 5,000 to
105,000 bases of DNA. Figure 3 shows that proteins are
thought to connect portions of the long loops formed by
the 30 nm particles to form a chromosome axis. These
condensed long loops are described as chromomeres in The
Molecular Biology of the Cell.



Prior Art 

The chromomere model of DNA structuring was presented by
N. A. Resnik, et al. [1] and is based on electron
microscopic data. There are more recent papers studying
a variety of genomes with electron microscopy but no
equivalent study of chromomeres has been done on a fully
sequenced genome.

A recent News Feature in Nature by T. Gura [2] described
the discovery of post-transcriptional gene silencing in
which viral RNA interacts with the transcribed RNA of the
cell to silence the expression of genes. This article
describes experiments in C. elegans and D. melanogaster
in which RNA that is complementary to mRNA is introduced
into a cell. This "antisense" RNA has the effect of
turning off the expression of one or more genes. The
introduced complementary RNA produces an "RNA
interference" called RNAi.

Thomas Werner and his colleagues at Genomatix in Munich,
Germany have developed an approach to understanding what
they call the "Matrix Attachment Region" (MAR). Figure 5
shows their interpretation of the structure of DNA
surrounding a gene. The following description of the
MAR is copied from the Genomatix web site:

"Matrix Attachment Regions (MARs) MARs are sequence
regions that are responsible for the attachment of
genomic DNA to the nuclear matrix or scaffold.
Transcription absolutely requires anchorage of genomic
DNA to the nuclear matrix.

Functional features of MARs:

Anchoring of regulatory elements like promoters and
enhancers to the nuclear matrix.

Ensuring long term activity of promoters and
enhancers in chromatin.

Insulation, rendering a functional domain
insensitive to position effects.

Genomatix is conducting a research project to define and
detect MARs by computer-analysis."

Brief Description of the Objects of the Invention

An object of the invention is to provide a method of
identifying DNA sequences that control the expression of
different collections of genes in a genome comprising,
detecting selected DNA sequences adjacent to some genes
excluding exons and introns.

An object of the invention is to provide a method of
identifying DNA sequences that control the expression of
different collections of genes comprising, detecting, by
computer, one or more pairs of non-adjacent DNA sequences
which are bound to two RNA sequences.

An object of the invention is to provide a method of 
identifying DNA sequences that control the expression of 
different collections of genes in a genome comprising 
detecting changes in connectron behavior in the genome. 

An object of the invention is to provide a method of 
modifying the expression of different gene collections in 
a genome, comprising detecting changes in connectron 
behavior as a result of an exogenous stimulus. 

An object of the invention is to provide a method of
detecting where and when new genes are being integrated
into a host genome comprising detecting the connectrons
in said host genome.

An object of the invention is to provide a method of
detecting the expression effect of different gene
collections in a given body comprising detecting the back
and forth flow of connectrons between the chromosomes
thereof.

An object of the invention is to provide a method of 
modifying a given body comprising modifying the 
connectron organization therein. 


An object of the invention is to provide a method of
detecting connectron control and target sequences in a
given genome comprising:

determining the base composition of said genome,

determining one or more sites of control sequence
organization, and/or

determining one or more sites of target application.

An object of the invention is to provide a method of
determining the response of a cell in any tissue to
changes in the cell's environment and/or genetic
composition comprising providing a complete genomic DNA
sequence for the organism and determining the effect of
changes in connectrons due to application of a given
exogenous stimulus to the genome.

An object of the invention is to provide a method of
determining in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes, the tetradic
relationship T1=C1 and T2=C2 where T1 and T2 are DNA
sequences 20 or more bases in length, where the C1
sequence is adjacent to the C2 sequence, where the T1 and
T2 sequences are on the same chromosome, and where the
C1/C2 sequences are on the same chromosome as T1 and T2
or where the C1/C2 sequences are on a chromosome
different from T1 and T2, wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 40 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,
and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of
determining in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits many different C1/C2 short
loops to control the existence of a T1-T2 long loop and
wherein said C1/C2 short loops can be on the same
chromosome or on different chromosomes from the T1-T2
long loop, wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 40 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,
and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of
determining in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control
the existence of many T1-T2 long loops, the C1/C2 short
loop can be on the same chromosome or on different
chromosomes from the T1-T2 long loops, wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 40 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,
and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of
determining the connectron relationships between
prokaryotes and their plasmids wherein said connectrons
implement a control mechanism between the two genomes
that makes it possible for them to form a symbiotic
relationship, and in the case of D. radiodurans the
relationship is not symmetric, and the D. radiodurans
genome sends C1/C2 short loops to the MP1 plasmid,
wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 40 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,
and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of
determining the connectron relationships that exist in
plants and higher animals.

An object of the invention is to provide a method of
determining in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control
the existence of one or more T1-T2 long loops without
being subject to any expression controls other than those
of the gene to which the C1/C2 is 3' UTR, wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 540 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart,
and

3' UTR - untranslated 3' end of an mRNA is beyond the
end of the last exon, a stop codon in the mRNA
causes the ribosome to stop the translation of mRNA
into protein.






An object of the invention is to provide a method of
determining in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control
the existence of one or more T1-T2 long loops such that
this C1/C2 short loop is itself subject to expression
control by another T1-T2 long loop which surrounds it,
wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 540 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,
and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of
determining in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes, the connectron
relationship that permits one C1/C2 short loop to control
the existence of the T1-T2 long loop that surrounds it,
wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 40 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,
and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart.

An object of the invention is to provide a method of
determining the connectron relationships that do not have
any genes within the T1-T2 long loop, wherein:

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, and the T2 or T1
sequences must be between about 1kb and 105kb apart.


An object of the invention is to provide a method of
determining the geneless connectron relationship where
one C1/C2 short loop controls the existence of many
geneless T1-T2 long loops, wherein:

C1 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C2 sequence must
occur in the same chromosome as the C1 sequence,

C2 sequence - any positive or negative strand DNA
sequence of 20 bases or more, the C1 sequence must
occur in the same chromosome as the C2 sequence,

C1/C2 - any positive or negative strand DNA sequence
of 40 or more bases such that the C1 sequence is
adjacent to the C2 sequence,

T1 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T2 sequence, the T1 and T2
sequences must be between about 1kb and 105kb apart,
and

T2 sequence - any positive or negative strand DNA
sequence of 20 bases or more that is on the same
chromosome as the T1 sequence, the T2 or T1
sequences must be between about 1kb and 105kb apart.

Description of the Drawings and Tables

The above and other objects, advantages and features of 
the invention will become more apparent when considered 
with the following specification and accompanying 
drawings and tables wherein: 

Figure 1 DNA is structured in six levels of increasing 
condensation. Double stranded DNA is level 1. 
Two turns of DNA are wrapped about each 
chromatin particle at level 2. The chromatin 
particles, which each contain 200 base pairs,
form into 30 nm particles at level 3. The 30
nm particles form into large loops with an
approximate dimension of 300 nm at level 4.
Metaphase chromosomes form a condensed
structure with an approximate dimension of 700
nm at level 5. An entire metaphase chromosome 
has a width of approximately 1400 nm at level 
6. The large loops at level 4 of the DNA 
structuring are thought to have between 20,000 
(20 kb) and 100,000 (100 kb) base pairs. 

The Molecular Biology of the Cell by Alberts,
Bray, Lewis, Raff, Roberts and Watson, 3rd ed.,
Garland Publishing, Inc., New York, 1994, p.
354

Figure 2 (a) Chromatin DNA forms into six-fold
symmetry 30 nm particles.

(b) The six-fold symmetry 30 nm particles form a
linear chain with a varying number of repeat
units.

The Molecular Biology of the Cell by Alberts,
Bray, Lewis, Raff, Roberts and Watson, 3rd ed.,
Garland Publishing, Inc., New York, 1994,
p. 345

Figure 3 Long loops of 30 nm particles are thought to be
closed at the bottom of the loop by proteins.

The Molecular Biology of the Cell by Alberts,
Bray, Lewis, Raff, Roberts and Watson, 3rd ed.,
Garland Publishing, Inc., New York, 1994, p.
348

Figure 4 (a) Transcription and Editing. (b) Movement of
the RNA through the Nucleus. (c) Connectron
Formation

Figure 5 Overview of schematic organization of a typical 
transcriptionally active chromosomal loop. 

Table 1 Connectron Properties for Prokaryotic, Archea
and Eukaryotic Genomes

Table 2 Yeast Inter-Chromosomal Connectron Distribution

Figure 6 Genome size plotted as a log-log function of
the Number of Connectrons

Figure 7 Number of Sequence Instances plotted as a
function of the Number of Fragments

Figure 8 Level 0 - The overall view of the algorithm 

Figure 9 Level 1 - Process Flow of the Algorithm 

Figure 10 Level 2a - two pages - Process Genome into 
Blocking Fragment File 

Figure 11 Level 2b - three pages - Compute the 
Connectrons for a Genome 

Figure 12 Level 2c - two pages - Analyze Possible 
Connectrons 

Figure 13 Level 3a - Setup Genome Usage Memory

Figure 14 Level 3b - Find DBP-Size Blocking File for T1-
Window

Figure 15 Level 1 - Find DBP-Size Blocking File for T2- 
Window 

Figure 16 Level 2a - two pages - Find C1/C2 Entries 

Figure 17 Level 2b - two pages - Scan Genome Usage Memory
for Potential Connectrons

Description of the Invention

A connectron is a relationship among four DNA sequences.
Each sequence must be at least 20 bases long. There is a
report by Sharp and Zamore [3] that RNA sequences of
"about length 25" are important as sources of RNAi. 27
bases were actually used as the minimum length of each of
the sequences. The T1 sequence is on one strand of some
chromosome in a genome. The T2 sequence is on the same
strand of the same chromosome as the T1 sequence. The T1
and T2 sequences (which are each at least 20 bases in
length) must be at least 5,000 bases distant from each
other but they can not be more than 105,000 bases distant
from each other. The C1 sequence and the C2 sequence
(which are each at least 20 bases in length) are adjacent
to each other on some strand of some chromosome in the
genome. The C1/C2 sequences - called the "short loop" -
can be on the same strand as the T1 and T2 sequences or
they can be on the opposite strand. The C1/C2 sequences
of the short loop can be on the same chromosome as the T1
and T2 sequences but they can also be on a different
chromosome in the genome. When a genome has only one
chromosome, then the point is moot. Many genomes, of
course, have several chromosomes. The C1 sequence is
identical to the T1 sequence and the C2 sequence is
identical to the T2 sequence.
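
The four-sequence relationship just described can be illustrated with a
naive exact-match search. The Python sketch below is not the algorithm of
Figures 8 to 17; it is a simplified illustration under stated
assumptions: only one strand is searched, every sequence has exactly k
bases, the T1-T2 distance is measured from the start of T1 to the start
of T2, and the genome is given as a dictionary from chromosome name to
sequence. All function and parameter names are hypothetical.

```python
from collections import defaultdict

def index_kmers(genome, k):
    """Map every k-base sequence to the (chromosome, start) positions where it occurs."""
    index = defaultdict(list)
    for chrom, seq in genome.items():
        for i in range(len(seq) - k + 1):
            index[seq[i:i + k]].append((chrom, i))
    return index

def possible_connectrons(genome, k=20, min_loop=5_000, max_loop=105_000):
    """Return (c_chrom, c_start, t_chrom, t1_start, t2_start) tuples such that
    C1 == T1, C2 == T2, C1 is adjacent to C2, and T1 and T2 lie on the same
    chromosome between min_loop and max_loop bases apart."""
    index = index_kmers(genome, k)
    hits = []
    for c_chrom, seq in genome.items():
        for c_start in range(len(seq) - 2 * k + 1):
            c1 = seq[c_start:c_start + k]
            c2 = seq[c_start + k:c_start + 2 * k]  # adjacent to C1
            for t_chrom, t1 in index[c1]:
                if (t_chrom, t1) == (c_chrom, c_start):
                    continue  # T1 must be a different copy than C1 itself
                for t2_chrom, t2 in index[c2]:
                    if t2_chrom != t_chrom:
                        continue  # T1 and T2 must share a chromosome
                    if (t2_chrom, t2) == (c_chrom, c_start + k):
                        continue  # T2 must be a different copy than C2 itself
                    if min_loop <= t2 - t1 <= max_loop:
                        hits.append((c_chrom, c_start, t_chrom, t1, t2))
    return hits
```

The loop-distance limits are parameters because this document quotes
both a 5,000 base and an approximately 1kb lower bound. Searching the
opposite strand would require running the same search on the reverse
complement of each chromosome.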

The C1/C2 sequence must be on the same strand as a gene,
and either be directly adjacent to the gene (i.e. a gap of 0
bases) for prokaryotic genomes or at this time be within
10,000 bases for eukaryotic genomes. The size of the gap
between the end of the gene and the beginning of the
C1/C2 sequence is a variable. The C1/C2 short loop is
expressed as the 3' UTR (Un-Translated Region) of the
gene. In the case of prokaryotic genes that do not
normally have introns, the whole mRNA becomes the active
species for connectron formation. In the case of
eukaryotic genes, the whole transcript is the active
species for connectron formation upon editing of the
transcript to eliminate the introns. The whole
transcript then can move about in the cytoplasm of
prokaryotic cells or the nucleus of eukaryotic cells.
Since the C1 sequence is equivalent to the T1 sequence
and the C2 sequence is equivalent to the T2 sequence, the
C1 RNA can form a Hoogsteen triple-stranded RNA/DNA/DNA
helix with the double-stranded T1 sequence. Similarly
the C2 RNA can form a Hoogsteen triple-stranded
RNA/DNA/DNA helix with the double-stranded T2 sequence.
Because the C1 sequence and the C2 sequence are adjacent
to each other, the C1/T1 RNA/DNA/DNA Hoogsteen triple
helix is brought into physical adjacency to the C2/T2
RNA/DNA/DNA Hoogsteen triple helix. RNA/DNA/DNA hybrid
helices are the most stable form of triple helix. RNA
double helices, DNA double helices, RNA triple helices
and DNA triple helices are all significantly less stable
than a RNA/double-stranded DNA triple helix. The stable
physical adjacency of the two triple-stranded Hoogsteen
helices ensures that the long loop of double-stranded DNA
between the T1 sequence and the T2 sequence can then be
structured into 30 nm chromatin particles as shown in
level 4 of figure 1. The genes on either strand of the
DNA between the T1 sequence and the T2 sequence, when they
are structured into the 30 nm chromatin particles, are not
open to promotion and expression.

The tetradic relationship between the T1 and T2 sequences
that form the long loop and the C1/C2 sequences that form
the short loop is called a connectron. The name
"connectron" was suggested by J. David Rawn Ph.D. of
Towson University. A connectron is possible if the T1,
T2, C1 and C2 sequences exist. A connectron is real if
the C1/C2 short loop sequence is adjacent to an
expressible gene. If the adjacent gene
is inside one or more T1-T2 long loops then this
connectron is said to be transient. If the adjacent gene
is not inside any possible T1-T2 long loop then the
connectron is said to be permanent. If a connectron is
inside of a T1-T2 long loop that has the same sequences
(i.e. T1 is really equal to C1 and T2 is really equal to
C2) then the connectron is said to be self-limiting.
This is true because once the C1/C2 sequence is expressed
it forms the T1-T2 long loop that then shuts off the
expression of the gene adjacent to the C1/C2 sequence.
Self-limiting connectrons can also be called "spike"
connectrons since they generate a short-duration spike of
the C1/C2 short loop sequence. If a T1-T2 long loop does
not contain any genes but it contains C1/C2 short loop
sequences then this type of connectron is said to be
geneless. The C1/C2 short loops within a geneless T1-T2
long loop can, of course, control the expression of
genes.
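
The permanent, transient and self-limiting categories defined above can
be expressed as a small decision procedure. The Python sketch below is
illustrative only. It assumes that each T1-T2 long loop has already been
computed and tagged with the identifier of the C1/C2 short loop that
forms it, and that "inside" means the adjacent gene lies entirely
between the start of T1 and the end of T2. The class, field and function
names are hypothetical, and the geneless case is not covered.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class LongLoop:
    chrom: str
    t1_start: int       # first base of T1
    t2_end: int         # one past the last base of T2
    short_loop_id: str  # C1/C2 short loop whose sequences equal T1 and T2

def classify_connectron(short_loop_id, gene_chrom, gene_start, gene_end, long_loops):
    """Classify a real connectron by the long loops that enclose its adjacent gene."""
    enclosing = [
        loop for loop in long_loops
        if loop.chrom == gene_chrom
        and loop.t1_start <= gene_start
        and gene_end <= loop.t2_end
    ]
    if not enclosing:
        return "permanent"      # the gene is not inside any T1-T2 long loop
    if any(loop.short_loop_id == short_loop_id for loop in enclosing):
        return "self-limiting"  # the enclosing loop is formed by this very C1/C2
    return "transient"          # enclosed, but only by loops of other C1/C2s
```

In this sketch a connectron that is both self-limiting and enclosed by
other loops is reported as self-limiting; a different precedence could
equally be chosen.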

The physical existence and lifetimes of the connectrons
must be proved by molecular biological experimentation.
This physical experimental process, however, is logically
quite separate from the computational experimentation
that has been conducted from June of 1999 to May of
2001. The computational search for the existence of
connectrons has been extremely positive. These
computations have shown that connectrons exist in
prokaryotes, in archea, between prokaryotes and their
plasmids, in single-celled eukaryotes, in multi-celled
eukaryotes, in plants, in higher animals and in humans.
All of these features and properties are described in the
claims section that follows.

The connectron invention is very powerful. It depends
only on sequence equivalency. The minimum length of the
four sequences seems to be about 20 bases. In the
calculations shown in this patent application, 27 bases
have been used as a minimum. The Nature News Feature [2]
says that other scientists have found RNA sequences of
length about 25 that have interesting gene silencing
properties. The Nature article does not give any
mechanism. Because of my algorithm and its use on a
variety of genomes, this patent application provides the
computational proof that a particular mechanism is highly
probable. The connectron invention provides an
explanation for how communication occurs within a
chromosome as well as between chromosomes in genomes that
have more than one chromosome. Since each T1-T2 long
loop can contain one or more genes, the connectron
invention provides a mechanism for turning on and turning
off sets of genes simultaneously. In time, the
connectron invention will provide an explanation for how
one cell's behavior comes to differ from
the behavior of another adjacent cell. It is already
clear from the computational experiments that have been
made on S. cerevisiae, C. elegans and D. melanogaster that
the number of geneless connectrons increases dramatically
as evolution proceeds from single-celled eukaryotes (i.e.
S. cerevisiae) to 1,000 cell eukaryotes (i.e. C. elegans)
to visible creatures (i.e. D. melanogaster). The
extension of this evolutionary progress to plants (i.e.
A. thaliana) and humans (i.e. H. sapiens) is limited by
the available sequence: only three chromosomes are
sequenced for A. thaliana and only one chromosome is
completely sequenced for H. sapiens. Although the complete
human genome was published in Nature and Science in
February of 2001, the NIH-sponsored genomic sequencing
results are available for only about 1/3 of the bases in
the whole genome. The human genomic sequence determined by
Celera Genomics, Inc. is available only by subscription.
Table 1 shows how the genome size, the number of genes,
the number of gene-containing and geneless connectrons
and the percentage of genes controlled are related in
many different genomes.

The C1/C2 short loops originate on one chromosome. The
T1-T2 long loops can be on the same or different
chromosomes. Table 2, which is for yeast (S. cerevisiae),
is a square matrix showing how many C1/C2 short loops on a
given chromosome are sent to form T1-T2 long loops on
other chromosomes. The diagonal of this matrix shows
that many chromosomes send connectrons to themselves.
The striking feature of this particular table is that
chromosome 6 sends connectrons only to chromosome 12 but
receives connectrons from chromosomes
4, 5, 7, 10, 12, 13, 15 and 16.
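
As an illustration only, a square matrix of this kind could be tallied as sketched below. The function name transfer_matrix and the sample pairs are hypothetical; they are not taken from Table 2 or from the program described in this application. Each input pair is assumed to be (chromosome holding the C1/C2 short loop, chromosome holding the T1-T2 long loop), numbered from 1.

```python
from collections import Counter


def transfer_matrix(pairs, n_chromosomes):
    """Count connectrons by (source chromosome, target chromosome).

    pairs: iterable of (source, target) chromosome numbers, 1-based.
    Returns a list of rows; row s-1, column t-1 holds the number of
    C1/C2 short loops on chromosome s that form T1-T2 long loops on
    chromosome t. The diagonal holds self-directed connectrons.
    """
    counts = Counter(pairs)
    return [
        [counts[(src, dst)] for dst in range(1, n_chromosomes + 1)]
        for src in range(1, n_chromosomes + 1)
    ]


# Hypothetical input, for illustration only (not data from Table 2).
example_pairs = [(6, 12), (4, 6), (5, 6), (12, 12), (12, 6)]
matrix = transfer_matrix(example_pairs, 16)
sent_by_6 = sum(matrix[5])                     # row 6: what chromosome 6 sends
received_by_6 = sum(row[5] for row in matrix)  # column 6: what it receives
```

Reading along a row gives what a chromosome sends, and reading down a column gives what it receives.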

Any tetrad of connectron sequences (i.e. the T1, T2, C1
and C2 sequences) as well as the fact of the adjacency of
the C1/C2 short loop sequence to the transcribing gene
can be patented because the content of matter and the
utility can be exactly described. The utility of a
connectron is that the T1-T2 long loop shuts off the
expression of the genes that lie between the T1 sequence
and the T2 sequence. In the case of geneless
connectrons, the utility is of a higher level in that the
C1/C2 short loops contained in the higher-level geneless
T1-T2 long loop eventually form other lower-level T1-T2
long loops around a set of genes.

The invention of connectrons comes at a particularly
important time in biological discovery. The geneless
connectrons make a many-to-many hierarchical control
mechanism possible. It is already clear from the
determination of the connectrons for C. elegans and D.
melanogaster that there are as many or more geneless
connectrons than there are genes. It has been clear for
some time that the number of genes in a genome is not
particularly correlated with the size of the genome.
Figure 6 shows that the size of a genome is roughly
linearly correlated with the number of connectrons.

The connectron invention can be used to generate a model 
of behavior in any cell. The simulation of connectron 
behavior in different genomes will be the subject of 
another patent application. 


The connectron invention provides for a rational
exploitation of the information contained in the raw
genomic DNA sequence by forming a hierarchy of
relationships between geneless connectrons, transient
connectrons, permanent connectrons, self-limiting
connectrons and the expression of genes.
Detailed Description of the Invention

The algorithm for the determination of connectrons in any
genome or any genome fragment is represented in the
following flow diagrams. The Level 0 diagram in figure 8
shows the general relationships in a digital computer.
The central processor of the digital computer uses the
computer program to take genome descriptors, the genomic
DNA sequences and the tables of gene features to produce
a file of blocking fragments and a file of the optimal
connectrons for the genome. The printer serves to make
hard copies of the files and this patent application.

The level 1 diagram in figure 9 shows the three essential
steps in the determination of connectrons. The genome is
first processed into a blocking fragment file. Then the
blocking fragments are used to compute the connectrons
for the genome. Finally the potential connectrons are
analyzed to determine if the C1/C2 sequences are in the
3'UTR of a gene.

The level 2a diagram in figure 10 shows
the steps required for the processing of the genome into
a file of blocking fragments. The genomic DNA sequence
is decomposed into 27-base frames for both the positive
and negative strands. These fragments are written to the
unsorted fragment file. The fragment file is then sorted,
and the sorted file is then read and formed into groups
of equivalent sequences. The (.blk) file contains the
sequence and a pointer to the (.gptr) file, which contains
the pointers to the positions of the fragments in the
genome. The position in the genome includes the
chromosome number, the position in the chromosome and the
strand (i.e. positive or negative). A sample of these
files follows.
Sample of the (.blk) file for S. cerevisiae

27-base fragment               Number         Pointer to
                               of instances   (.gptr) file

111111111111111111111111111    0              1
111111123244233313332443414    1              2
111111141113443133314333341    2              4
111111232442333133324434141    1              5
111111323311133323144423444    2              7
111111332213331341414443413    2              9
111111333444112343412323243    1              10
111111333444113343412323243    9              19
111111411134431333143333414    2              21
111111443223134142124434124    2              23
111112223234344444443144442    2              25
111112244123441122214421213    8              33
111112311241114344334134431    2              35
111112324423331333244341414    1              36
111112344232231344242234342    1              37
111112433444244421144134211    1              38
111112444311313442332142224    1              39
111113131241131114424413231    1              40
111113143332344311113133411    1              41
111113233111333231444234441    2              43

In fragments above 1=G, 2=C, 3=A, 4=T
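
The digit coding used in the sample above can be sketched as a pair of lookup tables. This is a minimal illustration that assumes only the mapping stated in the legend (1=G, 2=C, 3=A, 4=T); the function names encode and decode are hypothetical and are not taken from the program described in this application.

```python
# Mapping stated in the legend of the (.blk) sample: 1=G, 2=C, 3=A, 4=T.
BASE_TO_DIGIT = {"G": "1", "C": "2", "A": "3", "T": "4"}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}


def encode(sequence):
    """Turn a DNA string such as 'GCAT' into its digit form '1234'."""
    return "".join(BASE_TO_DIGIT[base] for base in sequence.upper())


def decode(digits):
    """Turn a digit string such as '1234' back into 'GCAT'."""
    return "".join(DIGIT_TO_BASE[digit] for digit in digits)


# Second fragment of the sample above, decoded back to bases.
fragment = "111111123244233313332443414"
bases = decode(fragment)
```

Because G sorts first under this coding, the fragments that begin with a run of 1s (runs of G) appear at the top of the sorted file, as in the sample.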

Sample of the (.gptr) file for S. cerevisiae

There are 16 chromosomes in S. cerevisiae

Item    Chromosome    Position         Direction
                      in Chromosome

1       0             0                0
2       4             11137            1
3       12            467619           1
4       12            458482           1
5       4             11138            1
6       12            465759           2
7       12            456622           1
8       1             219366           1
9       8             539978           1
10      14            522451           1
11      4             1099073          1
12      4             1210003          1
13      7             539068           1
14      12            654136           1
15      12            596455           1
16      15            121016           1
17      15            598127           2
18      16            847724           1
19      16            59765            1
20      12            467620           1
21      12            458483           1
22      12            461657           1
23      12            452520           1
24      13            838006           1
25      15            288270           1
26      4             83593            1
27      4             992867           1
28      6             162265           1
29      7             845687           1
30      10            531560           2
31      15            282208           1
32      16            860418           1
33      16            572308           1
34      12            465992           1
35      12            456855           1
36      4             11139            1
37      8             89343            1
38      4             10302            1
39      1             19894            2
40      16            9311             1
41      10            735203           1
42      12            465760           1
43      12            456623           1

In direction column above 1=positive strand,
2=negative strand
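
The construction of the two files can be sketched in memory as follows. This is a simplified illustration under stated assumptions, not the program of figure 10: the sort and grouping are done with a dictionary rather than with an on-disk fragment file, the negative strand is represented by the reverse complement recorded at the same start position, and the pointer stored with each group is taken to be the 1-based index of its first (.gptr) entry. The names blocking_fragments and reverse_complement are hypothetical.

```python
from collections import defaultdict

FRAME = 27  # frame length used in the calculations of this application
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Return the sequence read on the opposite strand."""
    return sequence.translate(COMPLEMENT)[::-1]


def blocking_fragments(chromosomes, frame=FRAME):
    """Group equal frames from both strands of every chromosome.

    chromosomes: dict mapping chromosome number -> DNA string.
    Returns (blk, gptr):
      blk  - list of (fragment, number of instances, pointer into gptr)
      gptr - list of (chromosome, position, direction), where direction
             is 1 for the positive strand and 2 for the negative strand
    """
    positions = defaultdict(list)
    for chromosome, sequence in chromosomes.items():
        sequence = sequence.upper()
        for start in range(len(sequence) - frame + 1):
            window = sequence[start:start + frame]
            if set(window) - set("ACGT"):
                continue  # skip frames with ambiguous bases (assumption)
            positions[window].append((chromosome, start + 1, 1))
            positions[reverse_complement(window)].append(
                (chromosome, start + 1, 2))
    blk, gptr = [], []
    for fragment in sorted(positions):
        hits = positions[fragment]
        blk.append((fragment, len(hits), len(gptr) + 1))
        gptr.extend(hits)
    return blk, gptr
```

For a real genome the dictionary would be replaced by the external sort the text describes, since every base position contributes two frames.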


The level 2b diagram in figure 11 shows the computation
of the connectrons. The genome descriptors consist of
the number and length of the chromosomes. The algorithm
uses an array that represents several facts about each
base position in the genome. The level 3a diagram in
figure 13 shows the setup of the Genome-Usage memory.
The gene features are used to prevent the region of the
genome that codes for proteins from being used for the
connectron sequences (i.e. the T1s, the T2s, the C1s and
the C2s). In the level 2a diagram of figure 10, the
algorithm steps through each chromosome and within each
chromosome through each base position looking for
acceptable T1-windows of 27 bases. A T1-window can be
used to form a connectron relationship if there are two
or more instances of this fragment in the blocking
fragment file. The computation in the level 3b diagram
of figure 14 determines whether the T1-window is
acceptable or not. Once an acceptable T1-window is found,
the algorithm (in the level 2a diagram of figure 10) looks
for acceptable T2-window positions that lie between 5,000
and 105,000 bases from the T1-window. The computation
for determining acceptable T2-window positions is done in
the level 3c diagram of figure 15. Once a pair of T1 and
T2 window positions is found, the algorithm looks among
the instances of these T1 and T2 sequences for a pair of
sequences C1 and C2 that lie within 200 bases of each
other on the same chromosome. The computation for
determining acceptable C1/C2 windows is shown in the
level 3d diagram in figure 16. In the level 3e diagram
of figure 17 the Genome-Usage memory is scanned for the
Possible-Connectrons. In the level 2c diagram of figure
12 the Possible-Connectrons are scanned to determine if
the C1/C2 sequences are within the Gap-Distance of a gene
on either the positive or the negative strands. The
Real-Connectrons are then written out in several
different files including the descriptions in the claims
section.
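
The distance tests in the search described above can be sketched as follows. This is a deliberately simplified illustration, not the procedure of figures 11 to 17: it ignores strands, the Genome-Usage memory, the exclusion of protein-coding regions and the Gap-Distance test, and it measures distances between window start positions, which is an assumption. The function name possible_connectrons and its input layout are hypothetical.

```python
T2_MIN, T2_MAX = 5_000, 105_000  # allowed T1-to-T2 distance, in bases
C_MAX_GAP = 200                  # maximum C1-to-C2 distance, in bases


def possible_connectrons(instances):
    """Yield candidate (t1, t2, c1, c2) tetrads.

    instances: dict mapping a 27-base fragment to the list of
    (chromosome, position) places where it occurs.
    """
    # A window qualifies only if its fragment occurs two or more times.
    repeated = {f: sorted(locs) for f, locs in instances.items()
                if len(locs) >= 2}
    located = sorted((loc, f) for f, locs in repeated.items()
                     for loc in locs)
    for i, (t1, f1) in enumerate(located):
        for t2, f2 in located[i + 1:]:
            distance = t2[1] - t1[1]
            if t2[0] != t1[0] or distance > T2_MAX:
                break  # later entries are farther away or on another chromosome
            if distance < T2_MIN:
                continue
            # Look among the other instances of both fragments for a
            # C1/C2 pair lying close together on one chromosome.
            for c1 in repeated[f1]:
                if c1 in (t1, t2):
                    continue
                for c2 in repeated[f2]:
                    if c2 in (t1, t2) or c2 == c1:
                        continue
                    if c1[0] == c2[0] and abs(c2[1] - c1[1]) <= C_MAX_GAP:
                        yield t1, t2, c1, c2
```

The fragment index expected here has the same shape as the grouped blocking fragments, which is why the search can run directly from the blocking fragment file.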
Examples

The algorithm for the determination of optimal
connectrons has been applied to a number of different
publicly available genomes. The connectron is a tetradic
relationship between four sequence elements - T1, T2, C1
and C2. The claims presented in this section are written
by the program NearGene that implements the flow diagram
Level 2c of figure 12. The examples are written in a
uniform type of English. Each example contains some or
all of the following elements:



Name of genome
Description of T1
    Length of T1-T2 loop
    The chromosome on which the T1-T2 loop exists
    The identifier number within the genome of the T1
    sequence
    The T1 sequence
Description of T2
    The identifier number within the genome of the T2
    sequence
    The T2 sequence
A list of genes whose expression is controlled by
the T1-T2 loop
    The common names of the genes as obtained from the
    NCBI gene feature file (.ptt)
A list of C1/C2 short loops whose expression is
controlled by the T1-T2 loop
    The chromosome on which the C1/C2 short loop exists
    The common name of the gene which expresses the
    C1/C2 short loop as an RNA
    The sequence of the C1/C2 short loop
A list of C1/C2 short loops that control the
formation of the T1-T2 loop
    The chromosome on which the C1/C2 short loop exists
    The common name of the gene which expresses the
    C1/C2 short loop as an RNA
    The sequence of the C1/C2 short loop
    The match between the C1/C2 sequence and the T1
    sequence
    The match between the C1/C2 sequence and the T2
    sequence



The uniform descriptions make it possible to rapidly 
comprehend the specifics in each example. 

When a sequence element is very long, a series of four
dots has been inserted between the beginning and ending
sequence groups. A variable number of bases have been
deleted.
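
A shortening rule of this kind could look like the sketch below. The number of bases kept at each end is not stated in this application, so the keep parameter and its default value are assumptions made purely for illustration, and the function name abbreviate is hypothetical.

```python
def abbreviate(sequence, keep=15, dots=" .... "):
    """Shorten a long sequence to its first and last `keep` bases.

    Sequences that are already short enough are returned unchanged.
    The value of `keep` is an illustrative assumption; the text only
    says that a variable number of bases is deleted.
    """
    if len(sequence) <= 2 * keep:
        return sequence
    return sequence[:keep] + dots + sequence[-keep:]


shortened = abbreviate("A" * 10 + "C" * 30 + "G" * 10, keep=10)
```

A listing shortened this way can no longer be used to check a stated length such as "Position = 1 to 347", which is why the position range is printed alongside each sequence.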

Index for Connectron Samples

1. Connectrons occur in prokaryotes, archaea, single-
celled eukaryotes and multi-celled eukaryotes.

2. Many connectrons control the expression of one set
of genes in prokaryotes, archaea, single-celled
eukaryotes and multi-celled eukaryotes.

3. One connectron controls the expression of many sets
of genes in prokaryotes, archaea, single-celled
eukaryotes and multi-celled eukaryotes.

4. Connectrons occur between prokaryotes and their
plasmids.

5. Connectrons occur in plants and higher animals.

6. Permanent connectrons exist in prokaryotes, archaea,
single-celled eukaryotes and multi-celled
eukaryotes.

7. Transient connectrons exist in prokaryotes, archaea,
single-celled eukaryotes and multi-celled
eukaryotes.

8. Self-limiting connectrons occur in prokaryotes,
archaea, single-celled eukaryotes and multi-celled
eukaryotes.

9. Geneless connectrons exist in single-celled and
multi-celled eukaryotes.

10. One connectron controls many geneless connectrons
in single-celled and multi-celled eukaryotes.
1. Connectrons occur in prokaryotes, archaea,
single-celled eukaryotes and multi-celled
eukaryotes.

Connectrons exist as tetradic relationships where the
sequence T1 is equivalent to the sequence C1 (written
T1=C1) and where the sequence T2 equals the sequence C2
(written T2=C2), where T1 and T2 are DNA sequences 20 or
more bases in length, where the C1 sequence is adjacent
to the C2 sequence, where the T1 and T2 sequences are on
the same chromosome, and where the C1/C2 sequences are on
the same chromosome as T1 and T2 or where the C1/C2
sequences are on a chromosome different from T1 and T2.
The connectron relationship has been found to exist in
prokaryotes, archaea, single-celled eukaryotes and
multi-celled eukaryotes.

Example of a prokaryote connectron - E. coli 

In this example the existence of the T1-T2 (3197-3308) 
long loop is controlled by three C1/C2 short loops (3307, 
3432 and 2218). The T1-T2 long loop controls the 
expression of 64 genes on chromosome 1 in addition to six 
C1/C2 (3204, 3206, 3223, 3228, 3301 and 3327) short 
loops. The C1/C2 short loop 3327 lies outside the range 
of the T1-T2 long loop (3197-3308) but this C1/C2 is 
expressed as a 3'UTR to the gene hemG that is within the 
range of the T1-T2 long loop. 



3307 Chromosome 1
3432 Chromosome 1
2218 Chromosome 1

       *   *

|      Chromosome 1      |
3197                  3308
|    3204  3224  3301    |
|    3206  3228  3327    |



Connectron control elements for chromosome 1 of the E. 
coli genome 



A double stranded DNA loop of length 93.542 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3197. This T1 control element has
the DNA sequence

Seq. Id. = 1 Position = 1 to 175

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCG
GGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCG
GGAA

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 3308. This T2
control element has the DNA sequence

Seq. Id. = 2 Position = 1 to 175

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGG
AACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGA
AAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATGCACACCCCGCGC
CGCT

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

rrsC gltU rrlC rrfC aspT trpT yifA yifE
yifB ilvL ilvG_1 ilvM ilvE ilvD ilvA ilvY
ilvC ppiC b3776 rep gppA rhlB trxA rhoL
rho rfe wzzE wecB rffH wecD wecE wzxE
yifM_2 wecG yifK argX hisR leuT proM aslB
aslA hemY hemX hemD cyaA cyaY b3808 dapF
uvrD b3814 corA yigF yigG rarD yigI pldA
recQ yigJ yigK pldB yigL yigM metR metE
ysgA udp yigN ubiE yigP b3836 yigU yigW_1
rfaH yigC ubiB fadA fadB pepQ trkH hemG



This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is
3204 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene rrsC and has the DNA sequence



Seq. Id. = 3 Position = 1 to 186 
GATGTGCCCAGATGGGATTAGCTAGTAGGTGGGGTAACGGCTCACCTAGGCGACGAT
CCCTAGCTGGTCTGAGAGGATGACCAGCCACACTGGAACTGAGACACGGTCCAGACT
CCTACGGGAGGCAGCAGTGGGGAATATTGCACAATGGGCGCAAGCCTGATGCAGCCA
TGCCGCGTGTATGAA

A C1/C2 short loop on chromosome 1 whose identifier is
3206 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene rrsC and has the DNA sequence

Seq. Id. = 4 Position = 1 to 186

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCG
AATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATAT
CTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCA
AGCTGAAAATTGAAA

A C1/C2 short loop on chromosome 1 whose identifier is
3223 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene rrlC and has the DNA sequence

Seq. Id. = 5 Position = 1 to 186

GCTGAAGTAGGTCCCAAGGGTATGGCTGTTCGCCATTTAAAGTGGTACGCGAGCTGG
GTTTAGAACGTCGTGAGACAGTTCGGTCCCTATCTGCCGTGGGCGCTGGAGAACTGA
GGGGGGCTGCTCCTAGTACGAGAGGACCGGAGTGGACGCATCACTGGTGTTCGGGTT
GTCATGCCAATGGCA
A C1/C2 short loop on chromosome 1 whose identifier is
3225 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene rrlC and has the DNA sequence

Seq. Id. = 6 Position = 1 to 144

AAACAGAATTTGCCTGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAAC
TCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTA
GGGAACTGCCAGGCATCAAATTAAGCAGTA

A C1/C2 short loop on chromosome 1 whose identifier is
3228 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene rrfC and has the DNA sequence

Seq. Id. = 7 Position = 1 to 112

GGTCATAAAACCGGTGGTTGTAAAAGAATTCGGTGGAGCGGTAGTTCAGTCGGTTAG
AATACCTGCCTGTCACGCAGGGGGTCGCGGGTTCGAGTCCCGTCCGTTCCGCCAC

A C1/C2 short loop on chromosome 1 whose identifier is
3301 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene ubiB and has the DNA sequence

Seq. Id. = 8 Position = 1 to 57

TTATCGTGCCTACAAATAGTCCGAACCGTAGGCCGGATAAGGCGTTTACGCCGCATC 
A C1/C2 short loop on chromosome 1 whose identifier is
3307 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene fadA and has the DNA sequence

Seq. Id. = 9 Position = 1 to 56

TGCCGGATGCGGCGTAAACGCCTTATCCGGCCTACGGTTCGGACTATTTGTAGGCA

A C1/C2 short loop on chromosome 1 whose identifier is
3327 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene hemG and has the DNA sequence

Seq. Id. = 10 Position = 1 to 347

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCG
GGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCG
GGAAGGCGTATTATG . . . CCCGTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGT
AGCTTAACCTTCGGGAGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCG
TAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTTACCTTAAAGAAGC
GTTCTTTG

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is
3307 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene hemG and has the
DNA sequence

Seq. Id. = 11 Position = 1 to 347

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCG
GGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCG
GGAAGGCGTATTATG . . . CCCGTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGT
AGCTTAACCTTCGGGAGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCG
TAACAAGGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTTACCTTAAAGAAGC
GTTCTTTG

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 11 Position = 1 to 175

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCG
GGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCG
GGAA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 11 Position = 28 to 202

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGG
AACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGA
AAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATGCACACCCCGCGC
CGCT
A C1/C2 short loop on chromosome 1 whose identifier is
3432 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene btuB and has the
DNA sequence

Seq. Id. = 12 Position = 1 to 335

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTAT
AATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTC
TCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGG
CGTATTATGCACACC . . . ACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTA
ACCTTCGGGAGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAA
GGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 12 Position = 1 to 169

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTAT
AATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTC
TCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 12 Position = 22 to 196

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGG
AACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGA
AAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATGCACACCCCGCGC
CGCT
A C1/C2 short loop on chromosome 1 whose identifier is
2218 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene clpB and has the
DNA sequence

Seq. Id. = 13 Position = 1 to 72

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCA
AACACGCCGCCGGGC

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 13 Position = 1 to 72

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCA
AACACGCCGCCGGGC

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 13 Position = 1 to 71

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCA
AACACGCCGCCGGG

Example of an archaea connectron - H. pylori
In this example the existence of the T1-T2 (812-882)
long loop is controlled by three C1/C2 short loops (881,
813 and 1241). The T1-T2 long loop controls the
expression of 54 genes on chromosome 1 in addition to one
C1/C2 (843) short loop.

881 Chromosome 1
813 Chromosome 1
1241 Chromosome 1

       |
      * *

|   Chromosome 1   |
812              882
|       842        |



Connectron control elements for chromosome 1 of H. pylori 
genome 

A double stranded DNA loop of length 96.385 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 812. This T1 control element has the
DNA sequence

Seq. Id. = 14 Position = 1 to 43 
TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 882. This T2 
control element has the DNA sequence 

Seq. Id. = 15 Position = 1 to 43 
TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG 
This long T1/T2 double stranded DNA loop modulates the
expression of the following genes
This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is
813 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene HP0998 and has the DNA sequence

Seq. Id. = 16 Position = 1 to 70

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACAC
TAAAGATATTTGG
The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is
881 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene HP1096 and has the DNA sequence

Seq. Id. = 17 Position = 1 to 70 

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACAC 
TAAAGATATTTGG 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 17 Position = 1 to 43
TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 17 Position = 28 to 70 
TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is
813 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene HP0998 and has
the DNA sequence

Seq. Id. = 18 Position = 1 to 70

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACAC
TAAAGATATTTGG

A C1/C2 short loop on chromosome 1 whose identifier is
881 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene HP1096 and has
the DNA sequence

Seq. Id. = 19 Position = 1 to 70

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACAC
TAAAGATATTTGG

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 19 Position = 1 to 43

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 19 Position = 28 to 70
TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG
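
The position ranges reported in a match listing of this kind can be reproduced with a longest-common-substring search. The sketch below is one possible way to do it and is not taken from the NearGene program; it uses Python's standard difflib module, and the function name match_position is hypothetical. The sequences are the H. pylori T1, T2 and C1/C2 (identifier 881) sequences as they are printed in this section.

```python
from difflib import SequenceMatcher

# Sequences as printed in this section for the H. pylori example.
T1 = "TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA"
T2 = "TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG"
C1C2_881 = ("TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACAC"
            "TAAAGATATTTGG")


def match_position(control, short_loop):
    """Locate the longest stretch shared by a T1 or T2 control
    sequence and a C1/C2 short loop sequence.

    Returns (first, last, matched), where first and last are 1-based
    positions within the short loop sequence.
    """
    matcher = SequenceMatcher(None, control, short_loop, autojunk=False)
    found = matcher.find_longest_match(0, len(control), 0, len(short_loop))
    first = found.b + 1
    last = found.b + found.size
    return first, last, short_loop[found.b:found.b + found.size]


t1_first, t1_last, _ = match_position(T1, C1C2_881)  # positions 1 to 43
t2_first, t2_last, _ = match_position(T2, C1C2_881)  # positions 28 to 70
```

Using the longest shared stretch rather than an exact substring test also covers short loops that contain only part of a control sequence.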
A C1/C2 short loop on chromosome 1 whose identifier is
1241 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene HP1535 and has
the DNA sequence

Seq. Id. = 20 Position = 1 to 56

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACA

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 20 Position = 1 to 43

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 20 Position = 28 to 56
TAGCGGAACTAAAGCATTCATCCCAAACA



Example of a single-celled eukaryote connectron - S. cerevisiae

In this example the existence of the T1-T2 (1352-1416)
long loop on chromosome 4 is controlled by one C1/C2
short loop (4213) on chromosome 10. The T1-T2 long loop
controls the expression of 34 genes on chromosome 4 in
addition to one C1/C2 (1356) short loop.



4213 Chromosome 10

|   Chromosome 4   |
1352            1416
|       1356       |



Connectron control elements for chromosome 1 of S.
cerevisiae genome

A double stranded DNA loop of length 68.908 kilo-bases on
chromosome 4 is bounded on the left by a T1 sequence
whose identifier is 1352. This T1 control element has
the DNA sequence

Seq. Id. = 21 Position = 1 to 37 

TTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAA 

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 1416. This T2
control element has the DNA sequence

Seq. Id. = 22 Position = 1 to 362

ATTAGATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCAACTATCATCTACT
AACTAGTATTTACGTTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAA
TGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTA
ATAGGATCAATGAATATTAACATATAAAACGATGATAATAATATTTATAGAATTGTG
TAGAATTGCAGATTCCCTTTTATGGATTCCTAAATCCTTGAGGAGAACTTCTAGTAT
ATCTACATACCTAATATTATAGCCTTAATCACAATGGAATCCCAACAATTACATCAA
AATCCACATTCTCTACAGTA

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes
This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops





A C1/C2 short loop on chromosome 4 whose identifier is
1356 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene YDR170W-A and has the DNA sequence

Seq. Id. = 23 Position = 1 to 311

AATCACACTAATCATTCTGATGATGAACTCCCTGGACACCTCCTTCTCGATTCAGGA
GCATCACGAACCCTTATAAGATCTGCTCATCACATACACTCAGCATCATCTAATCCT
GACATAAACGTAGTTGATGCTCAAAAAAGAAATATACCAATTAACGCTATTGGTGAC
CTACAATTTCACTTCCAGGACAACACCAAAACATCAATAAAGGTATTGCACACTCCT
AACATAGCCTATGACTTACTCAGTTTGAATGAATTGGCTGCAGTAGATATCACAGCA
TGCTTTACCAAAAACGTCTTAGAACG
The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 10 whose identifier is
4213 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene YJR029W and has
the DNA sequence

Seq. Id. = 24 Position = 1 to 346

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTATCAACTA
ATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTAT
GAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAACGCAAGGATTGATAATGTAATAG
GATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGT
AGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATA
TTCTGTATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCA
ACAT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 24 Position = 111 to 147

TTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 24 Position = 1 to 38
ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATC




Example of a multi-celled connectron - C. elegans 

In this example the existence of the T1-T2 (95-138) long
loop on chromosome 1 is controlled by three C1/C2 short
loops on chromosome 5 (21719, 21949 and 21655). The T1-T2
long loop controls the expression of four genes on
chromosome 1 in addition to seven C1/C2 (119, 122, 125,
130, 132, 134 and 136) short loops.



21719 Chromosome 5
21949 Chromosome 5
21655 Chromosome 5

        |
      * * *

|   Chromosome 1   |
95               138
|    119   122     |
|    125   130     |
|    132   134     |
|    136           |



A double stranded DNA loop of length 41.978 kilo-bases on 
30 chromosome 1 is bounded on the left by a Tl sequence 
whose identifier is 95. This Tl control element has the 
DNA sequence 

Seq. Id. = 25 Position = 1 to 55 

CAGCACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 138. This T2
control element has the DNA sequence

Seq. Id. = 26 Position = 1 to 36

ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATCA 

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

Y73A3A.1 Y73A3A.1 ZC123.3 ZC123.2 

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is
119 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZC123.3 and has the DNA sequence

Seq. Id. = 27 Position = 1 to 69 

TTGAGAACTCTGCGTCTCAACTCCCGCATTTTTTGTAGATCTACGTAGATCAAACCG
AAATGGGACACT 

A C1/C2 short loop on chromosome 1 whose identifier is
122 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZC123.3 and has the DNA sequence

Seq. Id. = 28 Position = 1 to 89

GCACGGGGTTCTGGCCTTCCTCATTGAATTTTTCGCGCTCCATTGACAATCGCCTGC 
CGGACAACGCGTGGGAAAGTCGTGTACTCCAC 

A C1/C2 short loop on chromosome 1 whose identifier is
125 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZC123.3 and has the DNA sequence

Seq. Id. = 29 Position = 1 to 89 

ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTCGGCAAAC 
TCTTTCATTTCAATTTATGAGGGAAGCCAGAA

A C1/C2 short loop on chromosome 1 whose identifier is
130 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZC123.2 and has the DNA sequence

Seq. Id. = 30 Position = 1 to 121 

CTCCCGCATTTTTTGTAGATCTACGTAGATCAAACCGAAATGAGGCACTTTCTGAAT
CCACGAGCTAGGCTTAAGCTTAGGCTTAAGCTTAGGCCTTTTCTCAGGCTTAGGCTT 
AGGCTTA 

A C1/C2 short loop on chromosome 1 whose identifier is
132 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZC123.2 and has the DNA sequence

Seq. Id. = 31 Position = 1 to 190

GCTTATGCTTGGGCTTAGGCTTAGGCGTAGGCTTAGGCTTAGGCTTAGGCTTATGCT 
TAGACTTAGTCTCACTATCAGTCTTAGGCTTAGGCTTAGACTTAGGCTTAAGCTTAG
GCTTAAGCTTAGACTTAGGCTTAGGCTTAGGCTTAGGCTTAGGCTTAGGTTTGGGCT 
TAGGCTTAGGCTTAACCTC 

A C1/C2 short loop on chromosome 1 whose identifier is
134 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZC123.2 and has the DNA sequence

Seq. Id. = 32 Position = 1 to 133

TCTGCGTCTTTTCTCCCGCATTTTTTGTAGATCTACGTAGATCAAACCGAAATGAGG 
CACTTTCTGAATCCACGAGCTAGGCTTAAGCTTAGGCTTAAGCTTAGGCCTTTTCTC 
AGGCTTAGGCTTAGGCTTA 

A C1/C2 short loop on chromosome 1 whose identifier is
136 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZC123.2 and has the DNA sequence

Seq. Id. = 33 Position = 1 to 190 

GCTTATGCTTGGGCTTAGGCTTAGGCGTAGGCTTAGGCTTAGGCTTAGGCTTATGCT 
TAGACTTAGTCTCACTATCAGTCTTAGGCTTAGGCTTAGACTTAGGCTTAAGCTTAG
GCTTAAGCTTAGACTTAGGCTTAGGCTTAGGCTTAGGCTTAGGCTTAGGTTTGGGCT 
TAGGCTTAGGCTTAACCTC

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is
21719 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene C39F7.5 and has
the DNA sequence

Seq. Id. = 34 Position = 1 to 65

ACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGCATTTTT 
TGTAGATC 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 34 Position = 1 to 51 

ACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 34 Position = 31 to 65
ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC 
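
One plausible way to reproduce the match ranges reported for short loop 21719 is to search for the longest run of bases shared by the C1/C2 sequence and each control element. The Python sketch below does this with a standard dynamic-programming longest-common-substring search. It is offered only as an illustration: the function names and variable names are assumptions, and nothing in this specification states that the listings were produced by this particular procedure.

```python
# Illustrative sketch only: a textbook longest-common-substring search,
# not necessarily the method used to generate the listings.
def longest_common_substring(a: str, b: str) -> tuple:
    """Return (length, start_in_a, start_in_b); starts are 1-based."""
    best_len, best_a, best_b = 0, 0, 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the common run that ended at a[i-2], b[j-2].
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_a = i - curr[j] + 1
                    best_b = j - curr[j] + 1
        prev = curr
    return best_len, best_a, best_b


def match_span(short_loop: str, control_element: str) -> tuple:
    """Return the 1-based inclusive range of the match within the short loop."""
    length, start, _ = longest_common_substring(short_loop, control_element)
    return start, start + length - 1


# Sequences copied from the listing (Seq. Id. 34, 25 and 26).
C1C2_21719 = (
    "ACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGCATTTTT"
    "TGTAGATC"
)
T1_95 = "CAGCACGTTCTTAACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC"
T2_138 = "ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATCA"

t1_span = match_span(C1C2_21719, T1_95)
t2_span = match_span(C1C2_21719, T2_138)
print("T1 match in Seq. Id. 34:", t1_span)
print("T2 match in Seq. Id. 34:", t2_span)
```

For these three sequences the search returns the same ranges as the listing, positions 1 to 51 for T1 and 31 to 65 for T2.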

A C1/C2 short loop on chromosome 5 whose identifier is
21949 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene F16B4.4 and has
the DNA sequence

Seq. Id. = 35 Position = 1 to 95

ACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATCT 
ACGTAGATCAAGCCGAAATGAGACACTCTGACACCACG

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 35 Position = 1 to 42

ACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 35 Position = 22 to 63
ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC 

A C1/C2 short loop on chromosome 5 whose identifier is
21655 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene C39F7.3 and has
the DNA sequence

Seq. Id. = 36 Position = 1 to 61 

AACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC 
TACG

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 36 Position = 1 to 36

AACCATGCAAAATCAGTTGAGAACTCTGCGTCTCTTCTCCCGC 

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 36 Position = 23 to 57 
ACTCTGCGTCTCTTCTCCCGCATTTTTTGTAGATC

2. Many Connectrons control the expression of
one set of genes in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes.

Many different C1/C2 short loops can control the
existence of one T1-T2 long loop. The C1/C2 short loops
can be on the same chromosome or on different chromosomes
from the T1-T2 long loop. This relationship is described
as "many-to-one". This relationship exists in
prokaryotes, archea, single-celled eukaryotes and
multi-celled eukaryotes.
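
The many-to-one relationship can be pictured as a grouping of short loops under the long loop they control. The Python sketch below is a minimal illustration that uses identifiers from the C. elegans example above and the E. coli example that follows. The record types `ShortLoop` and `LongLoop` and the function name are hypothetical and are not taken from the software described in this application.

```python
from collections import defaultdict
from typing import NamedTuple


class ShortLoop(NamedTuple):  # hypothetical record for a C1/C2 short loop
    identifier: int
    chromosome: int


class LongLoop(NamedTuple):  # hypothetical record for a T1-T2 long loop
    organism: str
    t1: int
    t2: int
    chromosome: int


# (controlling short loop, controlled long loop) pairs from the examples.
CONTROLS = [
    (ShortLoop(21719, 5), LongLoop("C. elegans", 95, 138, 1)),
    (ShortLoop(21949, 5), LongLoop("C. elegans", 95, 138, 1)),
    (ShortLoop(21655, 5), LongLoop("C. elegans", 95, 138, 1)),
    (ShortLoop(3307, 1), LongLoop("E. coli", 3197, 3308, 1)),
    (ShortLoop(3432, 1), LongLoop("E. coli", 3197, 3308, 1)),
    (ShortLoop(2218, 1), LongLoop("E. coli", 3197, 3308, 1)),
]


def controllers_by_long_loop(pairs):
    """Group the controlling short loops under each long loop."""
    grouped = defaultdict(list)
    for short_loop, long_loop in pairs:
        grouped[long_loop].append(short_loop)
    return dict(grouped)


grouped = controllers_by_long_loop(CONTROLS)
for long_loop, shorts in grouped.items():
    same = [s.identifier for s in shorts if s.chromosome == long_loop.chromosome]
    other = [s.identifier for s in shorts if s.chromosome != long_loop.chromosome]
    print(long_loop.organism, (long_loop.t1, long_loop.t2),
          "same chromosome:", same, "other chromosomes:", other)
```

Keying the grouping on the long loop makes the "many" side a plain list, so short loops on the same chromosome and on other chromosomes can be separated with a simple filter.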



Example of a many-to-one connectron in prokaryotes - E. 
coli 

In this example the existence of the T1-T2 (3197-3308)
long loop is controlled by three C1/C2 short loops (3307,
3432 and 2218).



3307 Chromosome 1
3432 Chromosome 1
2218 Chromosome 1

| Chromosome 1 |
3197        3308



A double stranded DNA loop of length 93.542 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3197. This T1 control element has
the DNA sequence

Seq. Id. = 37 Position = 1 to 175

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCG 
GGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCG 
GGAA 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3308. This T2 
control element has the DNA sequence 

Seq. Id. = 38 Position = 1 to 175

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGG 
AACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGA 
AAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATGCACACCCCGCGC 
CGCT

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

yigU yigW_l rfaH yigC ubiB

fadA fadB pepQ trkH hemG 

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is
3307 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene hemG and has the
DNA sequence

Seq. Id. = 39 Position = 1 to 440 

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCG 
GGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCG 
GGAAGGCGTATTATG . . . GGAGTCTGCAACTCGACTCCATGAAGTCGGAATCGCTAG 
TAATCGTGGATCAGAATGCCACGGTGAATACGTTCCCGGGCCTTGTACACACCGCCC 

GTCACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTAACCTTCGGGAGGGCG
CTTACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAAGGTAACCGTAGGGGA 
ACCTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGTTCTTTG 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 39 Position = 1 to 175 

AAAAAATGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACT 
CCCTATAATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCG
GGGTTCTCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCG 
GGAA

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 39 Position = 28 to 192 

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGG 
AACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGA 
AAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATGCACACCCCGCGC 
CGCT 

A C1/C2 short loop on chromosome 1 whose identifier is
3432 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene btuB and has the
DNA sequence

Seq. Id. = 40 Position = 1 to 335 

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTAT 
AATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTC
TCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGG 
CGTATTATGCACACC . . . ACACCATGGGAGTGGGTTGCAAAAGAAGTAGGTAGCTTA 
ACCTTCGGGAGGGCGCTTACCACTTTGTGATTCATGACTGGGGTGAAGTCGTAACAA 
GGTAACCGTAGGGGAACCTGCGGTTGGATCACCTCCTTACCTTAAAGAAGCGT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 40 Position = 1 to 169 

TGCGCGGTCAGAAAATTATTTTAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTAT 
AATGCGCCACCACTGACACGGAACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTC 
TCCTGAGAACTCCGGCAGAGAAAGCAAAAATAAATGCTTGACTCTGTAGCGGGAA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 40 Position = 22 to 196

TAAATTTCCTCTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGG 
AACAACGGCAAACACGCCGCCGGGTCAGCGGGGTTCTCCTGAGAACTCCGGCAGAGA 
AAGCAAAAATAAATGCTTGACTCTGTAGCGGGAAGGCGTATTATGCACACCCCGCGC 
CGCT

A C1/C2 short loop on chromosome 1 whose identifier is
2218 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene clpB and has the
DNA sequence

Seq. Id. = 41 Position = 1 to 72 

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCA
AACACGCCGCCGGGC 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 41 Position = 1 to 72 

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCA 
AACACGCCGCCGGGC 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 41 Position = 1 to 72

CTTGTCAGGCCGGAATAACTCCCTATAATGCGCCACCACTGACACGGAACAACGGCA 
AACACGCCGCCGGGC 



Example of a many-to-one connectron in archea - M. 
jannaschii 

In this example the existence of the T1-T2 (1630-1643)
long loop is controlled by four C1/C2 short loops (1629,
1642, 124 and 1533).



1629 Chromosome 1
1642 Chromosome 1
124 Chromosome 1
1533 Chromosome 1
        |
      * * *
| Chromosome 1 |
1630        1643



A double stranded DNA loop of length 4.998 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 1630. This T1 control element has
the DNA sequence

Seq. Id. = 42 Position = 1 to 175 



TTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTA 
TTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAATTAAGATTAATTAG 
GAAAGGAAATAAGATTTCTCTAACAGACAAGTTAAATTTTTGGATTTAAAAAGATAA
AAAT

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 1643. This T2 
control element has the DNA sequence 

Seq. Id. = 43 Position = 1 to 175 

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTA 
TTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACAAA
TAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAGAA
AGAT 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

MJ1597 MJ1598 MJ1599 MJ1600 MJ1601 

MJ1602 

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is
1629 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1597 and has
the DNA sequence

Seq. Id. = 44 Position = 1 to 139 

ATATGTTTGAAATTTGAAAATAAGAGTATTTAGAAGTTATTAATTAGTTCAAAGGAT
TTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTATT
GAATTATTCAGATTTTTAAAAATTA

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 44 Position = 37 to 139 

TTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTA 
TTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAATTA 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 44 Position = 81 to 139 

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAAT 
TA

A C1/C2 short loop on chromosome 1 whose identifier is
1642 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1602 and has
the DNA sequence

Seq. Id. = 45 Position = 1 to 177 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT
TATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACA 
AATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAG 
AAAGAT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 45 Position = 20 to 78

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAAT
TA 

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 45 Position = 3 to 177 

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTA
TTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACAAA 
TAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAGAA 
AGAT 

A C1/C2 short loop on chromosome 1 whose identifier is
124 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ0112 and has
the DNA sequence

Seq. Id. = 46 Position = 1 to 75 

ATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT 
TATTCAGATTTTTAAAAT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 46 Position = 1 to 75 

ATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT 
TATTCAGATTTTTAAAAT 



The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 46 Position = 20 to 75 

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAAT 

A C1/C2 short loop on chromosome 1 whose identifier is
1533 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1486 and has
the DNA sequence

Seq. Id. = 47 Position = 1 to 58 

TTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
T 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 47 Position = 1 to 58 

TTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTAT 
T

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 47 Position = 25 to 58
GCTGGTTTGATTATTTAGAATATTTGAGTTTATT 



Example of a many-to-one connectron in single-cell
eukaryotes - S. cerevisiae

In this example the existence of the T1-T2 (5515-5533)
long loop on chromosome 12 is controlled by seventeen
C1/C2 short loops (5516, 5532, 1939, 2323, 1942, 3286,
3649, 4764, 4751, 5536, 6102, 8023, 7356, 3293, 3291,
3289 and 146).



5516 Chromosome 12
5532 Chromosome 12
1939 Chromosome 4
2323 Chromosome 5
1942 Chromosome 5
3286 Chromosome 7
3649 Chromosome 8
4764 Chromosome 12
4751 Chromosome 12
5536 Chromosome 13
6102 Chromosome 14
8023 Chromosome 16
7356 Chromosome 16
3293 Chromosome 8
3291 Chromosome 8
3289 Chromosome 8
146 Chromosome 2

* * *

| Chromosome 12 |
5515        5533



A double stranded DNA loop of length 6.466 kilo-bases on
chromosome 12 is bounded on the left by a T1 sequence
whose identifier is 5515. This T1 control element has
the DNA sequence

Seq. Id. = 48 Position = 1 to 225

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 5533. This T2 
control element has the DNA sequence 

Seq. Id. = 49 Position = 1 to 225 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

YLR467W

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 12 whose identifier is
5516 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene YLR464W and has the DNA sequence

Seq. Id. = 50 Position = 1 to 252

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT

A C1/C2 short loop on chromosome 12 whose identifier is
5532 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene YLR467W and has the DNA sequence

Seq. Id. = 51 Position = 1 to 252 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 4 whose identifier is
1939 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YDR545W and has
the DNA sequence

Seq. Id. = 52 Position = 1 to 222

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGG 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 52 Position = 1 to 222 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGG 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 52 Position = 28 to 222 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGG 

A C1/C2 short loop on chromosome 5 whose identifier is
2323 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YER189W and has
the DNA sequence

Seq. Id. = 53 Position = 1 to 252

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 53 Position = 1 to 225 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 53 Position = 28 to 252 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 5 whose identifier is
1942 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YEL077C and has
the DNA sequence

Seq. Id. = 54 Position = 1 to 252

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 54 Position = 1 to 225

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 54 Position = 28 to 252

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT

A C1/C2 short loop on chromosome 7 whose identifier is
3286 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YGR296W and has
the DNA sequence

Seq. Id. = 55 Position = 1 to 252

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 55 Position = 1 to 225 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 55 Position = 28 to 252 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 8 whose identifier is
3649 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YHR219W and has
the DNA sequence

Seq. Id. = 56 Position = 1 to 252



AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 56 Position = 1 to 225 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 56 Position = 28 to 252 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 12 whose identifier is
4764 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YLL066C and has
the DNA sequence

Seq. Id. = 57 Position = 1 to 252

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 57 Position = 1 to 225 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 57 Position = 28 to 252 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 12 whose identifier is
4751 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YLL067C and has
the DNA sequence

Seq. Id. = 58 Position = 1 to 252 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 58 Position = 1 to 225

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 58 Position = 28 to 252

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT

A C1/C2 short loop on chromosome 13 whose identifier is
5536 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YML133C and has
the DNA sequence

Seq. Id. = 59 Position = 1 to 252

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 59 Position = 1 to 252 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 59 Position = 28 to 252 

TATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGT 
ATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAA 
AGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTA
GCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT

A C1/C2 short loop on chromosome 14 whose identifier is
6102 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YNL339C and has
the DNA sequence

Seq. Id. = 60 Position = 1 to 252 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 60 Position = 1 to 225 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 60 Position = 28 to 252 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT

A C1/C2 short loop on chromosome 16 whose identifier is
8023 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YPR204W and has
the DNA sequence

Seq. Id. = 61 Position = 1 to 252 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 61 Position = 1 to 252 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 61 Position = 28 to 252 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 



A C1/C2 short loop on chromosome 16 whose identifier is
7356 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YPL283C and has
the DNA sequence

Seq. Id. = 62 Position = 1 to 252

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 62 Position = 1 to 225 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 62 Position = 28 to 252

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

A C1/C2 short loop on chromosome 8 whose identifier is
3293 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YHL050C and has
the DNA sequence

Seq. Id. = 63 Position = 1 to 89 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTT

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 63 Position = 1 to 89

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTT 

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 63 Position = 28 to 89 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC
GTTTT

A C1/C2 short loop on chromosome 8 whose identifier is
3291 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YHL050C and has
the DNA sequence

Seq. Id. = 64 Position = 1 to 87 

ATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGA 
CAAGTGGGAAAGAGTAGGATAAAAAGACAA

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 64 Position = 1 to 87

ATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGA 
CAAGTGGGAAAGAGTAGGATAAAAAGACAA 

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 64 Position = 1 to 87 

ATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGA
CAAGTGGGAAAGAGTAGGATAAAAAGACAA 

A C1/C2 short loop on chromosome 2 whose identifier is
145 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YBL113C and has
the DNA sequence

Seq. Id. = 65 Position = 1 to 73

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGG 
TAAGAGACAACAGGCT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 65 Position = 1 to 47 

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 65 Position = 1 to 73 

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGG 
TAAGAGACAACAGGCT
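
Each listing above is introduced by a header of the form "Seq. Id. = N Position = a to b". As a rough consistency check, the span b - a + 1 should equal the number of bases printed under the header. The Python sketch below illustrates that check; the names `parse_header` and `span_matches` are hypothetical helpers introduced here for illustration and are not part of the described method. It assumes positions are 1-based and inclusive.

```python
import re

# Tolerant pattern for headers such as "Seq. Id. = 65 Position = 1 to 47".
HEADER = re.compile(
    r"Seq\.\s*Id\.\s*=\s*(\d+)\s+Position\s*=\s*(\d+)\s+to\s+(\d+)"
)


def parse_header(line):
    """Return (seq_id, start, end) parsed from a listing header line."""
    m = HEADER.search(line)
    if m is None:
        raise ValueError("not a sequence header: %r" % line)
    seq_id, start, end = (int(g) for g in m.groups())
    return seq_id, start, end


def span_matches(line, sequence):
    """True if the header's inclusive span equals the sequence length."""
    _, start, end = parse_header(line)
    return end - start + 1 == len(sequence)


# Header and bases of the T1 match reported for Seq. Id. 65.
header = "Seq. Id. = 65 Position = 1 to 47"
t1_match = "CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG"

print(parse_header(header))            # (65, 1, 47)
print(span_matches(header, t1_match))  # True
```

A check of this kind only confirms that a header and its listing agree in length; it does not confirm that the bases themselves are correct.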

A C1/C2 short loop on chromosome 8 whose identifier is
3289 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YHL050C and has
the DNA sequence

Seq. Id. = 66 Position = 1 to 73 

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGG 
TAAGAGACAACAGGCT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 66 Position = 1 to 47

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 66 Position = 1 to 73 

CTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGG 
TAAGAGACAACAGGCT 

A C1/C2 short loop on chromosome 2 whose identifier is
146 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YBL113C and has
the DNA sequence

Seq. Id. = 67 Position = 1 to 62

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAA 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 67 Position = 1 to 62 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAA

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 67 Position = 28 to 62 
ATTATGTATTGTGTAGTATAGTATATTGTAAGAAA 



Example of a many-to-one connectron in multi-cell
eukaryotes - C. elegans 

In this example the existence of the T1-T2 (3197-3308) 
long loop on chromosome 5 is controlled by three C1/C2 
short loops (4382, 4375 and 28633).



4382   Chromosome 1
4375   Chromosome 1
28633  Chromosome 5

|   Chromosome 5   |
28632          28697



A double stranded DNA loop of length 58.451 kilo-bases on 
chromosome 5 is bounded on the left by a T1 sequence
whose identifier is 28632. This T1 control element has
the DNA sequence 

Seq. Id. = 68 Position = 1 to 86 

GCAAAAATTGACTGAAAATTTGAATTTCCCGCAAAAAATTGACTGAAAATTTGAATT
TCCCGCCAAAAATTGACTGAAAATTTGAA

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 28697. This T2 
control element has the DNA sequence 

Seq. Id. = 69 Position = 1 to 160

CAAAAAATTGACTGAAAATTTGAATTTCCCTCCAAAAATTGACTGAAAATTTGAATT 
TCCCGCCAAAAATTGACTGAAAATTTGAATATCCCGCCAAAAATTGACTGAAAATTT 
GAATTTCCCGCCGAAAATTAAATGAAAAATGGAATTTCTCGCCGAA 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

M162.8 M162.4 M162.3 M162.6 M162.2
M162.1 M162.7

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is
4382 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene Y43F8B.10 and has
the DNA sequence

Seq. Id. = 70 Position = 1 to 319 

ATTATAGAAAATTTAAATTTCCCTCCAAAAAATTGACTGAAAATTTGAATTTCCCTC 
CAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATAT 
CCCGCCAAAAATTGACTGAAAATTTGAATTTCCCGCCGAAAATTAAATGAAAAATGG
AATTTCTCGCCGAAAAATTCAGTAAAAATTTGAATTTCCTGCCAAAAATTGACTGAA 
AATTTGAATTTCTTGCCAAAAAAGTGACTGGGAATTTGAATTTCCCTCCAAAAATTG 
ACTGAAATTTTGAATTTCCCGCTAAAAGTTGACT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 70 Position = 58 to 88

CAAAAATTGACTGAAAATTTGAATTTCCCGC

The match between the T2 sequence and the C1/C2 sequence 
is

Seq. Id. = 70 Position = 26 to 185 

CAAAAAATTGACTGAAAATTTGAATTTCCCTCCAAAAATTGACTGAAAATTTGAATT 
TCCCGCCAAAAATTGACTGAAAATTTGAATATCCCGCCAAAAATTGACTGAAAATTT
GAATTTCCCGCCGAAAATTAAATGAAAAATGGAATTTCTCGCCGAA 
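
The position ranges quoted for a match can be reproduced with a plain substring search. The Python sketch below is illustrative only. The function name `match_position` is hypothetical, exact matching is assumed, and positions are taken to be 1-based and inclusive, as in the "Position = 58 to 88" notation above. To keep the example short, only the first two printed rows (114 bases) of Seq. Id. 70 are used.

```python
def match_position(c_seq, match):
    """Return the 1-based inclusive (start, end) of the first exact
    occurrence of `match` in `c_seq`, or None if it does not occur."""
    idx = c_seq.find(match)  # 0-based index, -1 when absent
    if idx < 0:
        return None
    return idx + 1, idx + len(match)


# First two printed rows of the C1/C2 sequence Seq. Id. 70 (a prefix only).
C_SEQ_70_PREFIX = (
    "ATTATAGAAAATTTAAATTTCCCTCCAAAAAATTGACTGAAAATTTGAATTTCCCTC"
    "CAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATAT"
)

# Match reported between the T1 sequence and the C1/C2 sequence.
T1_MATCH = "CAAAAATTGACTGAAAATTTGAATTTCCCGC"

print(match_position(C_SEQ_70_PREFIX, T1_MATCH))  # (58, 88)
```

The search returns only the first occurrence; in a repetitive sequence such as this one, a shorter match may occur at more than one position.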

A C1/C2 short loop on chromosome 1 whose identifier is
4375 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene Y43F8B.10 and has
the DNA sequence

Seq. Id. = 71 Position = 1 to 319 

ATTATAGAAAATTTAAATTTCCCTCCAAAAAATTGACTGAAAATTTGAATTTCCCTC 
CAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATAT 
CCCGCCAAAAATTGACTGAAAATTTGAATTTCCCGCCGAAAATTAAATGAAAAATGG 
AATTTCTCGCCGAAAAATTCAGTAAAAATTTGAATTTCCTGCCAAAAATTGACTGAA 
AATTTGAATTTCTTGCCAAAAAAGTGACTGGGAATTTGAATTTCCCTCCAAAAATTG
ACTGAAATTTTGAATTTCCCGCTAAAAGTTGACT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 71 Position = 58 to 88

CAAAAATTGACTGAAAATTTGAATTTCCCGC 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 71 Position = 58 to 217 

CAAAAAATTGACTGAAAATTTGAATTTCCCTCCAAAAATTGACTGAAAATTTGAATT 
TCCCGCCAAAAATTGACTGAAAATTTGAATATCCCGCCAAAAATTGACTGAAAATTT 
GAATTTCCCGCCGAAAATTAAATGAAAAATGGAATTTCTCGCCGAA

A C1/C2 short loop on chromosome 5 whose identifier is
28633 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene M162.5 and has
the DNA sequence

Seq. Id. = 72 Position = 1 to 85 

CAAAAATTGACTGAAAATTTGAATTTCCCGCAAAAAATTGACTGAAAATTTGAATTT
CCCGCCAAAAATTGACTGAAAATTTGAA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 72 Position = 1 to 85

CAAAAATTGACTGAAAATTTGAATTTCCCGCAAAAAATTGACTGAAAATTTGAATTT
CCCGCCAAAAATTGACTGAAAATTTGAA

The match between the T2 sequence and the C1/C2 sequence 
is

Seq. Id. = 72 Position = 31 to 60 
CAAAAAATTGACTGAAAATTTGAATTTCCC 

3. One connectron controls the expression of
many sets of genes in prokaryotes, archea,
single-celled eukaryotes and multi-celled
eukaryotes.

One C1/C2 short loop can control the existence of many
T1-T2 long loops. The C1/C2 short loop can be on the
same chromosome as the T1-T2 long loops or on different
chromosomes. This relationship is described as
"one-to-many". This relationship exists in prokaryotes,
archea, single-celled eukaryotes and multi-celled
eukaryotes.
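
The one-to-many relationship can be pictured as a grouping of long loops under the short loop that controls them. The Python sketch below is illustrative only. It uses the identifiers from the E. coli example in this section; the pair-list layout and the function names `loops_by_controller` and `one_to_many` are assumptions made for illustration and are not part of the described method.

```python
from collections import defaultdict

# (C1/C2 short loop id, (T1 id, T2 id)) pairs from the E. coli example.
# The pair-list representation itself is an assumption of this sketch.
CONTROLS = [
    (3206, (3208, 3315)),
    (3206, (3436, 3476)),
    (3206, (3439, 3478)),
    (3206, (3441, 3479)),
]


def loops_by_controller(pairs):
    """Group T1-T2 long loops under the C1/C2 short loop controlling them."""
    grouped = defaultdict(list)
    for short_loop, long_loop in pairs:
        grouped[short_loop].append(long_loop)
    return dict(grouped)


def one_to_many(pairs):
    """Return the short loop ids that control more than one long loop."""
    grouped = loops_by_controller(pairs)
    return sorted(c for c, loops in grouped.items() if len(loops) > 1)


print(one_to_many(CONTROLS))  # [3206]
```

Grouping the same pairs by long loop instead of by short loop would give the many-to-one view described in the preceding section.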

Example of a one-to-many connectron in prokaryotes - E.
coli

In this example the existence of T1-T2 (3208-3315, 3436-
3476, 3439-3478 and 3441-3479) long loops is controlled
by one C1/C2 short loop (3206).



3206 Chromosome 1
        |
      * * *
|   Chromosome 1   |
3208            3315

3206 Chromosome 1
        |
      * * *
|   Chromosome 1   |
3436            3476

3206 Chromosome 1
        |
      * * *
|   Chromosome 1   |
3439            3478

3206 Chromosome 1
        |
      * * *
|   Chromosome 1   |
3441            3479

A double stranded DNA loop of length 93.377 kilo-bases on 
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3208. This T1 control element has the
DNA sequence

Seq. Id. = 73 Position = 1 to 340 

ACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCAAGCTGA 
AAATTGAAACACTGAACAACGAAAGTTGTTCGTGAGTCTCTCAAATTTTCGCAACAC
GATGATGAATCGAAAGAAACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACAC 
GGTGGATGCCCTGGC . . . AGTGTGTTTCGACACACTATCATTAACTGAATCCATAGG 
TTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCA 
ACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATCAG 
T

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3315. This T2 
control element has the DNA sequence 

Seq. Id. = 74 Position = 1 to 330 

TTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAACGAAAGTT 
GTTCGTGAGTCTCTCAAATTTTCGCAACTCTGAAGTGAAACATCTTCGGGTTGTGAG 
GTTAAGCGACTAAGCGTACACGGTGGATGCCCTGGCAGTCAGAGGCGATGAAGGACG
TGCTAATCTGCGATA . . . GGTTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGT
ACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAG
CAGCCCAGAGCCTGAATCAGTGTGTGTGTTAGTGGAAGCGTCTGGAAA 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 
3206 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene rrsC and has the
DNA sequence 

Seq. Id. = 75 Position = 1 to 367

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCG
AATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATAT 
CTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCA 
AGCTGAAAATTGAAA . . . ACCGGCGATTTCCGAATGGGGAAACCCAGTGTGTTTCGA
CACACTATCATTAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGAACTGAAACA 
TCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAA 
CGGGGAGCAGCCCAGAGCCTGAATCAGT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 75 Position = 121 to 367 

ACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCAAGCTGA
AAATTGAAACACTGAACAACGAAAGTTGTTCGTGAGTCTCTCAAATTTTCGCAACAC 
GATGATGAATCGAAAGAAACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACAC 
GGTGGATGCCCTGGC . . . AGTGTGTTTCGACACACTATCATTAACTGAATCCATAGG 
TTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCA
ACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATCAG
T 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 75 Position = 148 to 232 

TTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACACTGAACAACGAAAGTT 
GTTCGTGAGTCTCTCAAATTTTCGCAAC 

A double stranded DNA loop of length 41.279 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3436. This T1 control element has
the DNA sequence

Seq. Id. = 76 Position = 1 to 113 

ACGCAACGCGTGATAAGCAATTTTCGTGTCCCCTTCGTCTAGAGGCCCAGGACACCG 
CCCTTTCACGGCGGTAACAGGGGTTCGAATCCCCTAGGGGACGCCACTTGCTGGTT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3476. This T2 
control element has the DNA sequence 

Seq. Id. = 77 Position = 1 to 150 

AGTGAAAAGCAAGGCGTCTTGCGAAGCAGACTGATACGTCCCCTTCGTCTAGAGGCC 
CAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCGAATCCCCTAGGGGACGCCAC 
TTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATA

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 



gltT rrlB rrfB murB coaA
b3975 tyrU thrT tufB secE
nusG rplK rplA rplJ rplL
rpoB rpoC htrC thiH thiF
thiE yjaE yjaD hemE nfi
yjaG hupA yjaH yjaI hydH
purD purH



This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is
3206 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene rrsC and has the
DNA sequence

Seq. Id. = 78 Position = 1 to 553 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCG
AATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATAT 
CTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCA 
AGCTGAAAATTGAAACACTGAACAACGAAAGTTGTTCGTGAGTCTCTCAAATTTTCG 
CAACACGATGATGAATCGAAAGAAACATCTTCGGGTTGTGAGGTTAAGCGACTAAGC
GTACACGGTGGATGCCCTGGCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCGATA
AGCGTCGGTAAGGTGATATGAACCGTTATAACCGGCGATTTCCGAATGGGGAAACCC 
AGTGTGTTTCGACACACTATCATTAACTGAATCCATAGGTTAATGAGGCGAACCGGG 
GGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTA 
GCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATCAGT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 78 Position = 1 to 86 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCG 
AATCCCCTAGGGGACGCCACTTGCTGGTT 

The match between the T2 sequence and the C1/C2 sequence 
is

Seq. Id. = 78 Position = 1 to 113

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCG
AATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATA 



A double stranded DNA loop of length 41.336 kilo-bases on 
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3439. This T1 control element has
the DNA sequence 

Seq. Id. = 79 Position = 1 to 94 



CCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAA 
TCTGGATCAAGCTGAAAATTGAAACACTGAACAACGA 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3478. This T2 
control element has the DNA sequence 



Seq. Id. = 80 Position = 1 to 94



GTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACAC 
TGAACAACGAAAGTTGTTCGTGAGTCTCTCAAATTTT 



This long T1/T2 double stranded DNA loop modulates the
expression of the following genes 



rrlB rrfB murB coaA b3975
tyrU thrT tufB secE nusG
rplK rplA rplJ rplL rpoB
rpoC htrC thiH thiF thiE
yjaE yjaD hemE nfi yjaG
hupA yjaH yjaI hydH purD
purH gltV

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 
3206 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene rrsC and has the
DNA sequence 

Seq. Id. = 81 Position = 1 to 367 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCG
AATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATAT 
CTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCA
AGCTGAAAATTGAAA . . . ACCGGCGATTTCCGAATGGGGAAACCCAGTGTGTTTCGA 
CACACTATCATTAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGAACTGAAACA
TCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAA
CGGGGAGCAGCCCAGAGCCTGAATCAGT 

The match between the Tl sequence and the C1/C2 sequence 
is 

Seq. Id. = 81 Position = 106 to 199 

CCTTAATATCTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAA 
TCTGGATCAAGCTGAAAATTGAAACACTGAACAACGA 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 81 Position = 133 to 226

GTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCAAGCTGAAAATTGAAACAC 
TGAACAACGAAAGTTGTTCGTGAGTCTCTCAAATTTT 

A double stranded DNA loop of length 38.285 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3441. This T1 control element has
the DNA sequence 

Seq. Id. = 82 Position = 1 to 355 

AATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGGGTTGTGAGGTTAAGC
GACTAAGCGTACACGGTGGATGCCCTGGCAGTCAGAGGCGATGAAGGACGTGCTAAT 
CTGCGATAAGCGTCGGTAAGGTGATATGAACCGTTATAACCGGCGATTTCCGAATGG 
GGAAACCCAGTGTGT . . . GATGAGAGAAGATTTTCAGCCTGATACAGATTAAATCAG 
AACGCAGAAGCGGTCTGATAAAACAGAATTTGCCTGGCGGCAGTAGCGCGGTGGTCC
CACCTGACCCCATGCCGAACTCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGG
GGTCTCCCCATGCGAG 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3479. This T2 
control element has the DNA sequence

Seq. Id. = 83 Position = 1 to 356 

AAGAAACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGATGCCCTG 
GCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCGATAAGCGTCGGTAAGGTGATAT
GAACCGTTATAACCGGCGATTTCCGAATGGGGAAACCCAGTGTGTTTCGACACACTA 
TCATTAACTGAATCC . . . CAGATTAAATCAGAACGCAGAAGCGGTCTGATAAAACAG
AATTTGCCTGGCGGCAGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAACTCAGAA
GTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAAC
TGCCAGGCATCAAATTA 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes



rrlB rrfB murB coaA b3975
tyrU thrT tufB secE nusG
rplK rplA rplJ rplL rpoB
rpoC htrC thiH thiF thiE
yjaE yjaD hemE nfi yjaG
hupA yjaH yjaI hydH purD
purH gltV



The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is
3206 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene rrsC and has the
DNA sequence 

Seq. Id. = 84 Position = 1 to 519 

GTCCCCTTCGTCTAGAGGCCCAGGACACCGCCCTTTCACGGCGGTAACAGGGGTTCG 
AATCCCCTAGGGGACGCCACTTGCTGGTTTGTGAGTGAAAGTCACCTGCCTTAATAT 
CTCAAAACTCATCTTCGGGTGATGTTTGAGATATTTGCTCTTTAAAAATCTGGATCA 
AGCTGAAAATTGAAAAATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGG 
GTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGATGCCCTGGCAGTCAGAGGCGAT
GAAGGACGTGCTAATCTGCGATAAGCGTCGGTAAGGTGATATGAACCGTTATAACCG 
GCGATTTCCGAATGGGGAAACCCAGTGTGTTTCGACACACTATCATTAACTGAATCC 
ATAGGTTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGA
AATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGA
ATCAGT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 84 Position = 187 to 519 

AATTTTCGCAACACGATGATGAATCGAAAGAAACATCTTCGGGTTGTGAGGTTAAGC 
GACTAAGCGTACACGGTGGATGCCCTGGCAGTCAGAGGCGATGAAGGACGTGCTAAT
CTGCGATAAGCGTCGGTAAGGTGATATGAACCGTTATAACCGGCGATTTCCGAATGG 
GGAAACCCAGTGTGTTTCGACACACTATCATTAACTGAATCCATAGGTTAATGAGGC 
GAACCGGGGGAACTGAAACATCTAAGTACCCCGAGGAAAAGAAATCAACCGAGATTC 
CCCCAGTAGCGGCGAGCGAACGGGGAGCAGCCCAGAGCCTGAATCAGT 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 84 Position = 214 to 519 

AAGAAACATCTTCGGGTTGTGAGGTTAAGCGACTAAGCGTACACGGTGGATGCCCTG 
GCAGTCAGAGGCGATGAAGGACGTGCTAATCTGCGATAAGCGTCGGTAAGGTGATAT 
GAACCGTTATAACCGGCGATTTCCGAATGGGGAAACCCAGTGTGTTTCGACACACTA 
TCATTAACTGAATCCATAGGTTAATGAGGCGAACCGGGGGAACTGAAACATCTAAGT 
ACCCCGAGGAAAAGAAATCAACCGAGATTCCCCCAGTAGCGGCGAGCGAACGGGGAG
CAGCCCAGAGCCTGAATCAGT 



Example of a one-to-many connectron in archea - M.
jannaschii

In this example the existence of T1-T2 (534-611, 1139-
1159, and 1630-1643) long loops is controlled by one
C1/C2 short loop (1642).



1642 Chromosome 1
        |
      * * *
|   Chromosome 1   |
534              611

1642 Chromosome 1
        |
      * * *
|   Chromosome 1   |
1139            1159

1642 Chromosome 1
        |
      * * *
|   Chromosome 1   |
1630            1643



A double stranded DNA loop of length 72.886 kilo-bases on 
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 534. This T1 control element has the
DNA sequence



Seq. Id. = 85 Position = 1 to 37 



TAAGTAAATAAAATTTCTCTAACAAATAAGTTAAATT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 611. This T2 
control element has the DNA sequence 



Seq. Id. = 86 Position = 1 to 59

TAAATAAAATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATG
CT

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

MJ0526 MJ0529 MJ0530 MJ0531 MJ0532
MJ0534 MJ0535 MJ0536 MJ0538 MJ0539
MJ0540 MJ0541 MJ0542 MJ0543 MJ0544
MJ0545 MJ0547 MJ0548 MJ0549 MJ0550
MJ0552 MJ0553 MJ0554 MJ0555 MJ0556
MJ0558 MJ0559 MJ0560 MJ0561 MJ0562
MJ0563 MJ0564









The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is
1642 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1602 and has
the DNA sequence

Seq. Id. = 87 Position = 1 to 177

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT
TATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACA 
AATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAG 
AAAGAT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 87 Position = 92 to 127 

AAGTAAATAAAATTTCTCTAACAAATAAGTTAAATT 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 87 Position = 95 to 150 

TAAATAAAATTTCTCTAACAAATAAGTTAAATTTTTGGATTTAAAAAGATAAAAAT 

A double stranded DNA loop of length 14.509 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 1139. This T1 control element has
the DNA sequence

Seq. Id. = 88 Position = 1 to 78 

ATTTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGAT
TGTTTAAAATATTTGAGTTTA

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 1159. This T2 
control element has the DNA sequence 



Seq. Id. = 89 Position = 1 to 78



ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT 
TATTCAGATTTTTAAAAATTA 



This long T1/T2 double stranded DNA loop modulates the
expression of the following genes 



MJ1096 MJ1097 tRNA-Arg-3 MJ1098 MJ1099
MJ1100 MJ1101 MJ1102 MJ1103 MJ1104
MJ1105 MJ1106 MJ1107 MJ1108



The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is
1642 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1602 and has
the DNA sequence

Seq. Id. = 90 Position = 1 to 177 



ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT 
TATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACA 
AATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAG
AAAGAT

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 90 Position = 1 to 31 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATT 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 90 Position = 1 to 78 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT 
TATTCAGATTTTTAAAAATTA 

A double stranded DNA loop of length 4.998 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 1630. This T1 control element has
the DNA sequence

Seq. Id. = 91 Position = 1 to 175 

TTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTA
TTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAATTAAGATTAATTAG 
GAAAGGAAATAAGATTTCTCTAACAGACAAGTTAAATTTTTGGATTTAAAAAGATAA 
AAAT 

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 1643. This T2
control element has the DNA sequence

Seq. Id. = 92 Position = 1 to 175

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTA 
TTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACAAA 
TAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAGAA
AGAT 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

MJ1597 MJ1598 MJ1599 MJ1600 MJ1601 

MJ1602 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 
1642 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1602 and has
the DNA sequence 

Seq. Id. = 93 Position = 1 to 177 

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT
TATTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACA 
AATAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAG 
AAAGAT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 93 Position = 20 to 78

GCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTATTCAGATTTTTAAAAAT
TA 

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 93 Position = 3 to 177 

TTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAATTA
TTCAGATTTTTAAAAATTAGGATTAATTAGGCAAGTAAATAAAATTTCTCTAACAAA
TAAGTTAAATTTTTGGATTTAAAAAGATAAAAATACTCTGTTTTATTATGGAAAGAA 
AGAT 



Example of a one-to-many connectron in single-cell
eukaryotes - S. cerevisiae

In this example the existence of T1-T2 (158-171, 293-
317, 4295-4308 and 5916-5923) long loops is controlled
by one C1/C2 short loop (86).



86 Chromosome 1
        |
      * * *
|   Chromosome 1   |
158              171

86 Chromosome 1
        |
      * * *
|   Chromosome 1   |
293              317

86 Chromosome 1
        |
      * * *
|   Chromosome 10  |
4295            4308

86 Chromosome 1
        |
      * * *
|   Chromosome 13  |
5916            5923



A double stranded DNA loop of length 20.391 kilo-bases on 
chromosome 2 is bounded on the left by a T1 sequence
whose identifier is 158. This T1 control element has the
DNA sequence 

Seq. Id. = 94 Position = 1 to 153 

CCAATTGTTGGAATAAAAATCAACTATCATCTACTAACTAGTATTTACGTTACTAGT
ATATTATCATATACGGTGTTAGAAGATGACGCAAATGATGAGAAATAGTCATCTAAA 
TTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAG 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 171. This T2
control element has the DNA sequence 

Seq. Id. = 95 Position = 1 to 192 

ATAATTGTTGGAATAAAAATCAACTATCATCTACTAACTAGTATTTACGTTACTAGT
ATATTATCATATACGGTGTTAGAAGATGACACAAATGATGAGAAATAGTCATCTAAA 
TTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAGGATCAATGAATATTAACA 
TATAAAATGATGATAATAATA

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes 

YBL107W-A tL(UAA)B1 YBL107C YBL106C YBL105C
YBL104C YBL103C YBL102W YBL101C

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 86
controls the expression of the genes in this T1/T2 long
loop. This C1/C2 short loop is expressed as an RNA single
strand that is 3' UTR to the gene YAR009C and has the DNA
sequence

Seq. Id. = 96 Position = 1 to 362 

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTACTAACTA 
GTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAATGATG
AGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAGG 
ATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTA 
GAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATAT 
TCTGTATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAA 
CATTCACCCATTTCTCAGAA

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 96 Position = 34 to 65

AAATCAACTATCATCTACTAACTAGTATTTAC

The match between the T2 sequence and the C1/C2 sequence
is 

Seq. Id. = 96 Position = 34 to 65 

AAATCAACTATCATCTACTAACTAGTATTTAC



A double stranded DNA loop of length 38.470 kilo-bases on
chromosome 2 is bounded on the left by a T1 sequence
whose identifier is 293. This T1 control element has the
DNA sequence 

Seq. Id. = 97 Position = 1 to 258

GAATTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAATA 
TATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTGTCATCGAAG 
TTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAGGATAATGAAACATATAAA 
ACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG
ATTCCTATATCCTTGAGGAGAACTTCTAGT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 317. This T2 
control element has the DNA sequence

Seq. Id. = 98 Position = 1 to 77 

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAG 
AACTTCTAGTATATTCTGTA

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes

YBL005W-B tS(AGA)B YBL004W YBL003C YBL002W
YBL001C YBR001C YBR002C YBR003W YBR004C
YBR005W YBR006W YBR007C YBR008C YBR009C
YBR010W YBR011C YBR012C



The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 86
controls the expression of the genes in this T1/T2 long
loop. This C1/C2 short loop is expressed as an RNA single
strand that is 3'UTR to the gene YAR009C and has the DNA
sequence

Seq. Id. = 99 Position = 1 to 362

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTACTAACTA 
GTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAATGATG 
AGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAGG
ATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTA 
GAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATAT 
TCTGTATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAA 
CATTCACCCATTTCTCAGAA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 99 Position = 181 to 264

AAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATT 
CCATTTTGAGGATTCCTATATCCT 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 99 Position = 215 to 291

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAG
AACTTCTAGTATATTCTGTA
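
The T1 and T2 matches listed for Seq. Id. 99 cover overlapping stretches of the same C1/C2 sequence. As a small illustration, the sketch below computes the shared stretch from the two stated position ranges; the helper name `overlap` is hypothetical, and positions are assumed to be 1-based and inclusive.

```python
from typing import Optional, Tuple

Interval = Tuple[int, int]  # 1-based inclusive (start, end)


def overlap(a: Interval, b: Interval) -> Optional[Interval]:
    """Return the positions shared by two inclusive intervals, if any."""
    start = max(a[0], b[0])
    end = min(a[1], b[1])
    return (start, end) if start <= end else None


T1_MATCH = (181, 264)  # "Position = 181 to 264" for Seq. Id. 99
T2_MATCH = (215, 291)  # "Position = 215 to 291" for Seq. Id. 99

shared = overlap(T1_MATCH, T2_MATCH)
print(shared)                      # (215, 264)
print(shared[1] - shared[0] + 1)   # 50 positions in common
```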



A double stranded DNA loop of length 11.020 kilo-bases on
chromosome 10 is bounded on the left by a T1 sequence
whose identifier is 4295. This T1 control element has
the DNA sequence

Seq. Id. = 100 Position = 1 to 145

AAACGCAAGGATTGATAATGTAATAGGATCAATGAATATAAACATATAAAACGGAAT 
GAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTA 
TATCCTCGAGGAGAACTTCTAGTATATTCTG

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 4308. This T2 
control element has the DNA sequence 

Seq. Id. = 101 Position = 1 to 180

GGAAGCTGAAACGCAAGGATTGATAATGTAATAGGATCAATGAATATAAACATATAA 
AACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAG 
GATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAATATTATAGC
CTTTATCAA 

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

YJR027W YJR029W

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 87
controls the expression of the genes in this T1/T2 long
loop. This C1/C2 short loop is expressed as an RNA single
strand that is 3'UTR to the gene YAR009C and has the DNA
sequence

Seq. Id. = 102 Position = 1 to 359

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTACTAACTA
GTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAATGATG
AGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAGG
ATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTA
GAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATAT
TCTGTATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAA
CATTCACCCATTTCTCA

A double stranded DNA loop of length 5.462 kilo-bases on
chromosome 13 is bounded on the left by a T1 sequence
whose identifier is 5916. This T1 control element has
the DNA sequence

Seq. Id. = 103 Position = 1 to 146

AAGCTGAAGTGCAAGGATTGATAATGTAATAGGATAATGAAACATATAAAACGGAAT
GAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTA
TATCCTCGAGGAGAACTTCTAGTATATTCTGTA

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 5923. This T2
control element has the DNA sequence

Seq. Id. = 104 Position = 1 to 146

1 0 4 TAATAGGATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTA
TGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGT 
ATATTCTGTATACCTAATATTATAGCCTTTATCAA

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

YML045W

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 87
controls the expression of the genes in this T1/T2 long
loop. This C1/C2 short loop is expressed as an RNA single
strand that is 3'UTR to the gene YAR009C and has the DNA
sequence

Seq. Id. = 105 Position = 1 to 359

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTACTAACTA
GTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAATGATG
AGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAGG
ATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTA
GAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATAT
TCTGTATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAA
CATTCACCCATTTCTCA



Example of a one-to-many connectron in multi-celled
eukaryotes - C. elegans

In this example the existence of two T1-T2 long loops
(16554-16661 and 21565-21590) is controlled by one C1/C2
short loop (21591).
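
The one-to-many relationship can be pictured as a grouping of control records by short-loop identifier. The following is a minimal sketch using only the identifiers quoted in this example; the record layout and the function names are assumptions made for illustration and are not part of the described method.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (C1/C2 short loop id, chromosome of long loop, T1 id, T2 id)
CONTROL_RECORDS = [
    (21591, 4, 16554, 16661),
    (21591, 5, 21565, 21590),
]


def long_loops_by_short_loop(records) -> Dict[int, List[Tuple[int, int, int]]]:
    """Group the controlled T1/T2 long loops under each C1/C2 short loop."""
    grouped = defaultdict(list)
    for short_id, chromosome, t1_id, t2_id in records:
        grouped[short_id].append((chromosome, t1_id, t2_id))
    return dict(grouped)


def one_to_many(grouped) -> List[int]:
    """Return the short loops that control more than one long loop."""
    return sorted(s for s, loops in grouped.items() if len(loops) > 1)


grouped = long_loops_by_short_loop(CONTROL_RECORDS)
print(grouped[21591])        # [(4, 16554, 16661), (5, 21565, 21590)]
print(one_to_many(grouped))  # [21591]
```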

21591 Chromosome 5
  |
* * *
| Chromosome 4 |
16554      16661

21591 Chromosome 5

| Chromosome 5 |
21565      21590

A double stranded DNA loop of length 50.159 kilo-bases on
chromosome 4 is bounded on the left by a T1 sequence
whose identifier is 16554. This T1 control element has
the DNA sequence

Seq. Id. = 106 Position = 1 to 143

TGCCTGAAAAAATTGGCTCCGAGTTAGGACACTTGGGGTGGTCAAAAAATTTTGTGA
CTATTGTCAAATGAAAGATCATAGTTGATAACATAAATTCCCAAAGTTTCATAAAAA
TCGATACGCAGCGAACAAAGTTATCAATT

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 16661. This T2 
control element has the DNA sequence 

Seq. Id. = 107 Position = 1 to 141

CACTTGGGGTGGTCAAAAAATTTTGTGATTATTGTCAAATGAAAGATCATGGTTGAT
AACATAAATTCCCAAAGTTTCATAAAAATCGATACGCAGCGAACAAAGTTATGATTT
TTGACCCGGAACTTATTTGGAGACCTA

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

C23H5.7 C23H5.8a C23H5.3 C23H5.2 C23H5.9
C23H5.1

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is
21591 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene F25A2.1 and has
the DNA sequence

Seq. Id. = 108 Position = 1 to 117

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCATAAAAAT 
CGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTATTTGGAGACCTAAT 
ATT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 108 Position = 46 to 85

TTTCATAAAAATCGATACGCAGCGAACAAAGTTAT

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 108 Position = 1 to 42

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCA

A double stranded DNA loop of length 18.142 kilo-bases on
chromosome 5 is bounded on the left by a T1 sequence
whose identifier is 21565. This T1 control element has
the DNA sequence

Seq. Id. = 109 Position = 1 to 72

CTCCGAGTTAGGACACTTGGGGTGGACAAAAAATTTTGTGACTATTGTCAAATGAAA 
GATCATGGTTGATAA 

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 21590. This T2
control element has the DNA sequence

Seq. Id. = 110 Position = 1 to 115

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCATAAAAAT
CGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTATTTGGAGACCTAAT
A

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

T21H3.2 T21H3.1 F25A2.1 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is
21591 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene F25A2.1 and has
the DNA sequence

Seq. Id. = 111 Position = 1 to 117

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCATAAAAAT
CGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTATTTGGAGACCTAAT
ATT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 111 Position = 1 to 30

TATTGTCAAATGAAAGATCATGGTTGATAA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 111 Position = 1 to 115

TATTGTCAAATGAAAGATCATGGTTGATAACATAAATTCCCACAATTTCATAAAAAT
CGATACGCAGCGAACAAAGTTATGATTTTTGACCCGGAACTTATTTGGAGACCTAAT
A

4. Connectrons occur between prokaryotes and
their plasmids.

Connectron relationships exist between prokaryotes and
their plasmids. These connectrons implement a control
mechanism between the two genomes that makes it possible
for them to form a symbiotic relationship. In the case
of D. radiodurans the relationship is not symmetric. The
D. radiodurans genome sends C1/C2 short loops to the MP1
plasmid.

Example of a prokaryote/plasmid connectron - D.
radiodurans

In this example the existence of two T1-T2 long loops
(2654-2694 and 2692-2749) in chromosome 3, which is the
plasmid MP1, is controlled by one C1/C2 short loop (16) in
chromosome 1.
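
The asymmetry can be checked by listing the direction of every control relationship that crosses from one replicon to the other. The following is a minimal sketch built only from the identifiers given for this example; the dictionary layout and the function names are hypothetical.

```python
# Replicon on which each C1/C2 short loop in this example resides.
SHORT_LOOP_REPLICON = {
    16: "chromosome 1",
    2653: "plasmid MP1",
    2693: "plasmid MP1",
    2768: "plasmid MP1",
}
# Replicon on which each T1-T2 long loop in this example resides.
LONG_LOOP_REPLICON = {
    (2654, 2694): "plasmid MP1",
    (2692, 2749): "plasmid MP1",
}
# (short loop id, controlled long loop) pairs as listed for this example.
CONTROLS = [
    (16, (2654, 2694)), (2768, (2654, 2694)), (2653, (2654, 2694)),
    (16, (2692, 2749)), (2768, (2692, 2749)),
    (2693, (2692, 2749)), (2653, (2692, 2749)),
]


def cross_replicon_directions(controls):
    """Return the set of (source, target) replicon pairs that differ."""
    return {
        (SHORT_LOOP_REPLICON[s], LONG_LOOP_REPLICON[l])
        for s, l in controls
        if SHORT_LOOP_REPLICON[s] != LONG_LOOP_REPLICON[l]
    }


def is_symmetric(directions) -> bool:
    """True only if every direction also occurs in reverse."""
    return all((dst, src) in directions for src, dst in directions)


directions = cross_replicon_directions(CONTROLS)
print(directions)                # {('chromosome 1', 'plasmid MP1')}
print(is_symmetric(directions))  # False
```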

16 Chromosome 1
2768 Chromosome 3 (plasmid MP1)
2653 Chromosome 3 (plasmid MP1)

| Chromosome 3 (plasmid MP1) |
2654                      2694
| 2693 |

16 Chromosome 1
2768 Chromosome 3 (plasmid MP1)
2693 Chromosome 3 (plasmid MP1)

| Chromosome 3 (plasmid MP1) |
2692                      2749
| 2693 2695 |

A double stranded DNA loop of length 46.903 kilo-bases on
chromosome 3 (plasmid MP1) is bounded on the left by a T1
sequence whose identifier is 2654. This T1 control
element has the DNA sequence

Seq. Id. = 112 Position = 1 to 274

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGC 
AGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCAT
TCCGTGGGGCGCGTTACACCAGGCGACTGTCAGTACAGCAATCGAGAGTGGGCTGAT 
CAGCCCACTGTGCGTTCTGGCCATCGACGCCTCTTTTCACCGCAAAGCCGGTCAGCA 
CACCGCACACCTCGGCTCGTTCTGGAATGGCTGTGCCGCGCGGACC 

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 2694. This T2
control element has the DNA sequence

Seq. Id. = 113 Position = 1 to 274

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGATTCGTCG 
TTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGAC 
TGTCAGTACAGCAATCGAGAGTGGGCTGATCAGCCCACTGTGCGTTCTGGCCATCGA 
CGCCTCTTTTCACCGCAAAGCCGGTCAGCACACCGCACACCTCGGCTCGTTCTGGAA 
TGGCTGTGCCGCGCGGACCGAACGCGGAATCGAGCAATCCTGTTGT

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

DRB0020 DRB0021 DRB0022 DRB0023 DRB0024
DRB0025 DRB0027 DRB0030 DRB0032 DRB0033
DRB0034 DRB0035 DRB0037 DRB0038 DRB0039
DRB0041 DRB0042 DRB0043 DRB0044 DRB0045
DRB0047 DRB0051 DRB0052 DRB0054 DRB0055
DRB0057

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2693 controls the expression of the genes
of one or more other T1/T2 long loops. This C1/C2 short
loop is expressed as an RNA single strand that is 3'UTR to
the gene DRB0057 and has the DNA sequence

Seq. Id. = 114 Position = 1 to 103

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGG
AATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGGAC 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 16
controls the expression of the genes in this T1/T2 long
loop. This C1/C2 short loop is expressed as an RNA single
strand that is 3'UTR to the gene DR0009 and has the DNA
sequence

Seq. Id. = 115 Position = 1 to 186

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTTCTCAGC
GCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTT
CTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCG 
GAGAGTACGATTCGT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 115 Position = 105 to 186 

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGC 
AGCCTGCTCGGAGAGTACGATTCGT 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 115 Position = 132 to 186 

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGATTCGT 

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2768 controls the expression of the genes
in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene DRB0133 and has the DNA sequence

Seq. Id. = 116 Position = 1 to 186

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTTCTCAGC
GCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTT
CTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCG
GAGAGTACGATTCGT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 116 Position = 105 to 186

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGC
AGCCTGCTCGGAGAGTACGATTCGT

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 116 Position = 132 to 186

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGATTCGT

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2653 controls the expression of the genes
in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene DRB0017 and has the DNA sequence

Seq. Id. = 117 Position = 1 to 186

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTC 
TCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGG
AGAGTACGATTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCG
CGTTACACCAGGCGA

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 117 Position = 47 to 186

CAGCGTTTTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGC
AGCCTGCTCGGAGAGTACGATTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCAT
TCCGTGGGGCGCGTTACACCAGGCGA 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 117 Position = 74 to 186

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGATTCGTCG 
TTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGA 



A double stranded DNA loop of length 68.612 kilo-bases on
chromosome 3 (plasmid MP1) is bounded on the left by a T1
sequence whose identifier is 2692. This T1 control
element has the DNA sequence

Seq. Id. = 118 Position = 1 to 103

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGG 
AATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGGAC 

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 2749. This T2
control element has the DNA sequence

Seq. Id. = 119 Position = 1 to 103

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTT
TTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGT

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

DRB0059 DRB0060 DRB0061 DRB0062 DRB0064
DRB0065 DRB0066 DRB0067 DRB0068 DRB0069
DRB0070 DRB0072 DRB0073 DRB0074 DRB0076
DRB0077 DRB0079 DRB0080 DRB0081 DRB0083
DRB0085 DRB0086 DRB0087 DRB0088 DRB0089
DRB0090 DRB0092 DRB0093 DRB0094 DRB0096
DRB0097 DRB0098 DRB0102 DRB0103 DRB0104
DRB0105 DRB0106 DRB0107 DRB0111 DRB0112

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2693 controls the expression of the genes
of one or more other T1/T2 long loops. This C1/C2 short
loop is expressed as an RNA single strand that is 3'UTR to
the gene DRB0057 and has the DNA sequence

Seq. Id. = 120 Position = 1 to 103

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGG
AATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGGAC

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2695 controls the expression of the genes
of one or more other T1/T2 long loops. This C1/C2 short
loop is expressed as an RNA single strand that is 3'UTR to
the gene DRB0057 and has the DNA sequence

Seq. Id. = 121 Position = 1 to 274

GCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGGAGAGTACGATTCGTCG 
TTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCGCGTTACACCAGGCGAC 
TGTCAGTACAGCAATCGAGAGTGGGCTGATCAGCCCACTGTGCGTTCTGGCCATCGA 
CGCCTCTTTTCACCGCAAAGCCGGTCAGCACACCGCACACCTCGGCTCGTTCTGGAA 
TGGCTGTGCCGCGCGGACCGAACGCGGAATCGAGCAATCCTGTTGT

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 16
controls the expression of the genes in this T1/T2 long
loop. This C1/C2 short loop is expressed as an RNA single
strand that is 3'UTR to the gene DR0009 and has the DNA
sequence

Seq. Id. = 122 Position = 1 to 186

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTTCTCAGC 
GCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTT 
CTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCG
GAGAGTACGATTCGT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 122 Position = 28 to 130

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGG
AATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGGAC

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 122 Position = 55 to 157

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTT
TTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGT

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2768 controls the expression of the genes
in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene DRB0133 and has the DNA sequence

Seq. Id. = 123 Position = 1 to 309

GCTGTGAAATCACCGCTTCCAATGGGTCTGATGGCCATCCTACAGTACGTTCTCAGC 
GCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTT
CTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCG
GAGAGTACGATTCGTCGGACCGAACGCGGAATCGAGCAATCCTGTTGTGCCCTCATT
GATGTCCAGCACCGGCAGGCCTTGACGGTCGATGTCCGTCAGACCCTGACCGGGTCT
GAGGCTCCAACTCGTCTGGAACAG

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 123 Position = 28 to 130

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGG
AATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGGAC

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 123 Position = 55 to 157

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTT
TTTCTCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGT

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2693 controls the expression of the genes
in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene DRB0057 and has the DNA sequence

Seq. Id. = 124 Position = 1 to 103

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGG
AATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGGAC

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 124 Position = 1 to 103

CTGATGGCCATCCTACAGTACGTTCTCAGCGCGGTCCCGCTGCGCAAGACGCAGCGG
AATTTCCTGACCGTGCTGCTCAGCGTTTTTCTCGCTGTTCCTGGAC

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 124 Position = 28 to 103

AGCGCGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTT 
TTTCTCGCTGTTCCTGGAC 

A C1/C2 short loop on chromosome 3 (plasmid MP1) whose
identifier is 2653 controls the expression of the genes
in this T1/T2 long loop. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene DRB0017 and has the DNA sequence

Seq. Id. = 125 Position = 1 to 186

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTC
TCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGTATGCAGCCTGCTCGG
AGAGTACGATTCGTCGTTGGCTGCACCGAAGTGACGATGGGGCCATTCCGTGGGGCG
CGTTACACCAGGCGA

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 125 Position = 1 to 72

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTC
TCGCTGTTCCTGGAC

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 125 Position = 1 to 99

CGGTCCCGCTGCGCAAGACGCAGCGGAATTTCCTGACCGTGCTGCTCAGCGTTTTTC
TCGCTGTTCCTGGACGGCTGAACGCCCTGAATCTCTCCCGGT

5. Connectrons occur in plants and higher
animals

Connectron relationships exist in plants and higher
animals.

Example of a plant connectron - A. thaliana

In this example the existence of the T1-T2 (423-469)
long loop is controlled by six C1/C2 short loops (972,
21396, 422, 21762, 21813 and 10882). The T1-T2 long loop
controls the expression of six genes on chromosome 2 in
addition to two C1/C2 short loops (426 and 430).



972 Chromosome 2
21396 Chromosome 4
422 Chromosome 2
21762 Chromosome 4
21813 Chromosome 4
10882 Chromosome 4

| Chromosome 2 |
423        469
| 426  430 |



A double stranded DNA loop of length 42.285 kilo-bases on
chromosome 2 is bounded on the left by a T1 sequence
whose identifier is 423. This T1 control element has the
DNA sequence

Seq. Id. = 126 Position = 1 to 67

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAA
AAACGAAATA

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 469. This T2 
control element has the DNA sequence 

Seq. Id. = 127 Position = 1 to 67 

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTAATTTTCAAA 
AATAATAACC 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

At2g02070 At2g02080 At2g02090 At2g02100 At2g02120 
At2g02130 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 2 whose identifier is
426 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene At2g02060 and has the DNA sequence

Seq. Id. = 128 Position = 1 to 55 

TTCCAAAAATAATAACCAATCAAAATCAACATATAAGATTTGATATCTAAATTTT 

A C1/C2 short loop on chromosome 2 whose identifier is
430 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene At2g02060 and has the DNA sequence

Seq. Id. = 129 Position = 1 to 55 

TTGCGGAAAAATAATATCATCATTATAAAAAAATAATTAGAGTTTTTTCGCATAT 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 2 whose identifier is
972 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene At2g04240 and has
the DNA sequence

Seq. Id. = 130 Position = 1 to 118

GTATGCCATTAGAAATAAAATTTTAAAAGTAAATTAATTCATCTCTTTAAAAATTAA
AAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTA
ATTT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 130 Position = 53 to 106

ATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 130 Position = 67 to 118

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTAATTT

A C1/C2 short loop on chromosome 4 whose identifier is
21396 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene AT4g15300 and has
the DNA sequence

Seq. Id. = 131 Position = 1 to 122

TGCCATTAGAAATAAAATTTTAAAGAGTAAATTAATTTATCTCTTTAAGGATTAAAA
AGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTAAT
TTCCAAAA

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 131 Position = 38 to 104 

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAA 
AAACGAAATA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 131 Position = 65 to 116 

TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATTATTAATTT 

A C1/C2 short loop on chromosome 2 whose identifier is
422 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene At2g02060 and has
the DNA sequence 

Seq. Id. = 132 Position = 1 to 137 

TAACCTTAATTTTTGTAAGTAATTATATAGGTATGCCATTAGAAATAAAATTTTAAA 
GAGTAAATTAATTTATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATT 
AAATTTAATTAAAAAACGAAATA 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 132 Position = 71 to 137 

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAA 
AAACGAAATA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 132 Position = 98 to 137 
TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATA 

A C1/C2 short loop on chromosome 4 whose identifier is
21762 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene AT4g17510 and has
the DNA sequence

Seq. Id. = 133 Position = 1 to 65

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGA
AATACATT

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 133 Position = 1 to 61 

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGA 
AATA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 133 Position = 22 to 65 
TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATT 

A C1/C2 short loop on chromosome 4 whose identifier is
21813 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene AT4g17680 and has
the DNA sequence

Seq. Id. = 134 Position = 1 to 65

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGA
AATACATT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 134 Position = 1 to 61

TTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAAAAAACGA
AATA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 134 Position = 22 to 65 
TACTAATTTAATTAATTAAATTTAATTAAAAAACGAAATACATT 

A C1/C2 short loop on chromosome 2 whose identifier is
10882 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene At2g26540 and has
the DNA sequence

Seq. Id. = 135 Position = 1 to 56

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAA

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 135 Position = 1 to 56 

TATCTCTTTAAGGATTAAAAAGTCAAATACTAATTTAATTAATTAAATTTAATTAA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 135 Position = 28 to 56 

TACTAATTTAATTAATTAAATTTAATTAA
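
Each listing pairs a "Position = a to b" header with the matched bases, so the two can be cross-checked: the number of bases should equal b - a + 1. The following is a minimal sketch of such a check, shown on the T2 match for Seq. Id. 135 above; the function name `check_listing` and the header pattern are assumptions based on the layout of this document.

```python
import re

# Header layout as it appears in the listings of this document.
HEADER = re.compile(r"Seq\. Id\. = (\d+)\s+Position = (\d+) to (\d+)")


def check_listing(header: str, sequence_lines):
    """Compare the span stated in the header with the bases listed.

    Returns (seq_id, expected_length, actual_length, consistent).
    """
    found = HEADER.search(header)
    if found is None:
        raise ValueError("unrecognized header: " + header)
    seq_id, start, end = (int(g) for g in found.groups())
    # Join printed lines and drop any whitespace inside them.
    bases = "".join("".join(line.split()) for line in sequence_lines)
    expected = end - start + 1
    return seq_id, expected, len(bases), expected == len(bases)


result = check_listing(
    "Seq. Id. = 135 Position = 28 to 56",
    ["TACTAATTTAATTAATTAAATTTAATTAA"],
)
print(result)  # (135, 29, 29, True)
```

A listing whose header and bases disagree would return `False` in the last field, which flags it for manual review rather than deciding which of the two is wrong.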



Example of an animal connectron - D. melanogaster

A double stranded DNA loop of length 88.159 kilo-bases on
chromosome 4 is bounded on the left by a T1 sequence
whose identifier is 3340. This T1 control element has
the DNA sequence

Seq. Id. = 136 Position = 1 to 132

ACCTAAAAGAAGTACCGTTTTTTACTCCTAATTACCAATTCTAACCATCCATATCAC
TTTTTGACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTTGTAAGG
GGTAACATCATAAAAATT

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3372. This T2 
control element has the DNA sequence 

Seq. Id. = 137 Position = 1 to 136 

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCACTTTTT 
GACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTTGTAAGGGGTAA 
CATCATCAAAATTTGCGAAAAA 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

[Some of the following gene names have not been
determined.]

CG11207 - CG2186 CG2157
Ork1 -



This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 4 whose identifier is
3362 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene XXX and has the DNA sequence

Seq. Id. = 138 Position = 1 to 134 

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCACTTTTT 
GACGGACTCCGTTAAAATAATTTTTGACCAAATTTTCGCATTTTTTGTAATCAAAAT 
TTGCAAAAAATTGAAAAAAC 

A C1/C2 short loop on chromosome 4 whose identifier is
3364 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene XXX and has the DNA sequence

Seq. Id. = 139 Position = 1 to 83 

CAAAATTTGAATGCAAATCGATTGGGAATCAAAAAACAAACTCAACGAGGTATGACA 
TTCCATATTTGGGCCATTATTTCCAA 

A C1/C2 short loop on chromosome 4 whose identifier is
3366 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene XXX and has the DNA sequence

Seq. Id. = 140 Position = 1 to 62 

TTTTTTCACAAAAATTAGGAAAATGATTTTGGGTAAAAAAATGAATATTTAAGTTGG 
GTTTT 

A C1/C2 short loop on chromosome 4 whose identifier is
3369 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene XXX and has the DNA sequence

Seq. Id. = 141 Position = 1 to 87 

AAATCGATTGGGAATCAAAAAACAAACCTCAACGAGGTATGACATTCCATATCTGGG 
CCATTATTTCCAATCTTTTGATCAAAATAC 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 4 whose identifier is
3373 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene XXX and has the
DNA sequence

Seq. Id. = 142 Position = 1 to 136

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCACTTTTT
GACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTTGTAAGGGGTAA
CATCATCAAAATTTGCGAAAAA

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 142 Position = 15 to 120 

TTTTACTCCTAATTACCAATTCTAACCATCCATATCACTTTTTGACGGACTCCGTGA 
AAATAATTTTTGGCCAAATTTTCGCATTTTTTGTAAGGGGTAACATCAT 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 142 Position = 1 to 136 

AAAAAAGTACCGCGTTTTACTCCTAATTACCAATTCTAACCATCCATATCACTTTTT 
GACGGACTCCGTGAAAATAATTTTTGGCCAAATTTTCGCATTTTTTGTAAGGGGTAA 
CATCATCAAAATTTGCGAAAAA 



Example of an animal connectron - H. sapiens 

All of the human genome has been fully sequenced by
both the NIH-led global sequencing project and the
Celera Genomics, Inc. project. The gene descriptors for
this chromosome do not yet exist. Without the positions
and directions of the genes, it is not possible to select
from among the possible connectrons to determine the real
connectrons.

Human chromosome 22 has been processed and there are 31,00
possible connectrons.

The gene descriptors for all the chromosomes of the human 
genome should become available within the year.

6. Permanent connectrons exist in prokaryotes,
archea, single-celled eukaryotes and multi-celled
eukaryotes.

C1/C2 short loops are normally expressed as the 3'UTR of
some gene. A class of connectron relationships exists
that permits one C1/C2 short loop to control the existence
of one or more T1-T2 long loops without being subject to
any expression controls other than those of the gene to
which the C1/C2 is 3'UTR. These connectron relationships
are described as "permanent". Permanent connectrons
exist in prokaryotes, archea, single-celled eukaryotes
and multi-celled eukaryotes.


Example of a prokaryote permanent connectron - E. coli 



In this example the existence of the T1-T2 (3200-3210)
long loop is controlled by a C1/C2 short loop (3432).
The expression of this C1/C2 short loop is controlled
only by the gene btuB.



3432 Chromosome 1
|

* * *

| Chromosome 1 |

3200 3210



A double stranded DNA loop of length 93.339 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3200. This T1 control element has
the DNA sequence

Seq. Id. = 143 Position = 1 to 378

AAGCGGCACTGCTCTTTAACAATTTATCAGACAATCTGTGTGGGCACTCGAAGATAC 
GGATTCTTAACGTCGCAAGACGAAAAATGAATACCAAGTCTCAAGAGTGAACACGTA 
ATTCATTACGAAGTTTAATTCTTTGAGCATCAAACTTTTAAATTGAAGAGTTTGATC
ATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACAG 
GAAACAGCTTGCTGTTTCGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAAC 
TGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGCA 
AGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCATC 


This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3310. This T2 
control element has the DNA sequence 

Seq. Id. = 144 Position = 1 to 378

CAGACAATCTGTGTGGGCACTCGAAGATACGGATTCTTAACGTCGCAAGACGAAAAA 
TGAATACCAAGTCTCAAGAGTGAACACGTAATTCATTACGAAGTTTAATTCTTTGAG 
CGTCAAACTTTTAAATTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAG 
GCCTAACACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAG
TGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTACTGG 
AAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCT 
CTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGT 

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 
3432 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene btuB and has the
DNA sequence 

Seq. Id. = 145 Position = 1 to 520 

AAGCGGCACTGCTCTTTAACAATTTATCAGACAATCTGTGTGGGCACTCGAAGATAC 
GGATTCTTAACGTCGCAAGACGAAAAATGAATACCAAGTCTCAAGAGTGAACACGTA 
ATTCATTACGAAGTTTAATTCTTTGAGCCAGACAATCTGTGTGGGCACTCGAAGATA 
CGGATTCTTAACGTCGCAAGACGAAAAATGAATACCAAGTCTCAAGAGTGAACACGT 
AATTCATTACGAAGTTTAATTCTTTGAGCGTCAAACTTTTAAATTGAAGAGTTTGAT 
CATGGCTCAGATTGAACGCTGGCGGCAGGCCTAACACATGCAAGTCGAACGGTAACA 
GGAAGAAGCTTGCTTCTTTGCTGACGAGTGGCGGACGGGTGAGTAATGTCTGGGAAA 
CTGCCTGATGGAGGGGGATAACTACTGGAAACGGTAGCTAATACCGCATAACGTCGC 
AAGACCAAAGAGGGGGACCTTCGGGCCTCTTGCCATCGGATGTGCCCAGATGGGATT 
AGCTAGT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 145 Position = 1 to 142

AAGCGGCACTGCTCTTTAACAATTTATCAGACAATCTGTGTGGGCACTCGAAGATAC 
GGATTCTTAACGTCGCAAGACGAAAAATGAATACCAAGTCTCAAGAGTGAACACGTA 
ATTCATTACGAAGTTTAATTCTTTGAGC 


The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 145 Position = 143 to 520 


CAGACAATCTGTGTGGGCACTCGAAGATACGGATTCTTAACGTCGCAAGACGAAAAA 
TGAATACCAAGTCTCAAGAGTGAACACGTAATTCATTACGAAGTTTAATTCTTTGAG 
CGTCAAACTTTTAAATTGAAGAGTTTGATCATGGCTCAGATTGAACGCTGGCGGCAG 
GCCTAACACATGCAAGTCGAACGGTAACAGGAAGAAGCTTGCTTCTTTGCTGACGAG 
TGGCGGACGGGTGAGTAATGTCTGGGAAACTGCCTGATGGAGGGGGATAACTACTGG
AAACGGTAGCTAATACCGCATAACGTCGCAAGACCAAAGAGGGGGACCTTCGGGCCT 
CTTGCCATCGGATGTGCCCAGATGGGATTAGCTAGT 




Example of an archea permanent connectron - H. pylori 

In this example the existence of the T1-T2 (812-882)
long loop is controlled by a C1/C2 short loop (1241).
The expression of this C1/C2 short loop is controlled
only by the gene HP1535.

1241 Chromosome 1

| Chromosome 1 |

812 882



A double stranded DNA loop of length 96.385 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 812. This T1 control element has the
DNA sequence

Seq. Id. = 146 Position = 1 to 43 

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 882. This T2 
control element has the DNA sequence 


Seq. Id. = 147 Position = 1 to 43 

TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG 

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

HP1056 HP1058 HP106G HP1065 HPtRNA-Ser

HP1066 HP1067 HP1069 HP1070 HP1074 

HP1075 HP1076 HP1077 HP1078 HP1079 

HP1080 HP1081 HP1083 HP1084 HP1085 

HP1088 HP1091 HP1092 HP1093 HP1094 
HP1095 HP1096 

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is
1241 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene HP1535 and has
the DNA sequence

Seq. Id. = 148 Position = 1 to 56 

TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 148 Position = 1 to 43 
TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 148 Position = 28 to 56 
TAGCGGAACTAAAGCATTCATCCCAAACA
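
The position ranges quoted in this example can be reproduced with a longest-common-substring search. The following Python sketch is illustrative only and is not taken from the specification: the function name longest_common_substring is hypothetical, and it assumes that a "match" is a direct character-for-character identity between the sequences as printed, reported as 1-based inclusive positions in the C1/C2 sequence.

```python
def longest_common_substring(t_seq, c_seq):
    """Return (start, end, match) for the longest run shared by t_seq and c_seq.

    start and end are 1-based inclusive positions within c_seq, matching the
    "Position = a to b" convention used in the listings (an assumption).
    """
    best_len = 0
    best_end = 0  # exclusive 0-based end index in c_seq
    prev = [0] * (len(c_seq) + 1)
    for i in range(1, len(t_seq) + 1):
        curr = [0] * (len(c_seq) + 1)
        for j in range(1, len(c_seq) + 1):
            if t_seq[i - 1] == c_seq[j - 1]:
                # Extend the common run that ended at the previous characters.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = j
        prev = curr
    start = best_end - best_len
    return start + 1, best_end, c_seq[start:best_end]


# Sequences copied from the H. pylori listing (Seq. Id. 146, 147 and 148).
T1_812 = "TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCA"
T2_882 = "TAGCGGAACTAAAGCATTCATCCCAAACACTAAAGATATTTGG"
C_1241 = "TTTTACTCATAGGGTTTTTATAGTTCCTAGCGGAACTAAAGCATTCATCCCAAACA"

t1_match = longest_common_substring(T1_812, C_1241)
t2_match = longest_common_substring(T2_882, C_1241)
print(t1_match[:2], t2_match[:2])
```

For the H. pylori sequences above, the sketch reports positions 1 to 43 for the T1 match and 28 to 56 for the T2 match, in agreement with the ranges quoted for Seq. Id. 148.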
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Example of a single-celled permanent connectron - S. 
cerevisiae

In this example the existence of the T1-T2 (5515-5533) 
long loop is controlled by a C1/C2 short loop (6102). 
The expression of this C1/C2 short loop is controlled 
only by the gene YNL339C. 

6102 Chromosome 14 



| Chromosome 12 |

5515 5533 



A double stranded DNA loop of length 6.466 kilo-bases on 
chromosome 12 is bounded on the left by a T1 sequence
whose identifier is 5515. This T1 control element has
the DNA sequence 

Seq. Id. = 149 Position = 1 to 225 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 5533. This T2 
control element has the DNA sequence

Seq. Id. = 150 Position = 1 to 225

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

YLR467W 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 14 whose identifier is 
6102 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene YNL339C and has
the DNA sequence 

Seq. Id. = 151 Position = 1 to 252 

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA 
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTGAAC 
ATCCGGGTAAGAGACAACAGGGCT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 151 Position = 1 to 225

AGGAAATTGTTGTTACGAAAGTCAGTGATTATGTATTGTGTAGTATAGTATATTGTA
AGAAATTTTTTTTTCTAGGGAATATGCGTTTTGATGTAGTAGTATTTCACTGTTTTG 
ATTTAGTGTTTGTTGCACGGCAGTAGCGAGAGACAAGTGGGAAAGAGTAGGATAAAA 
AGACAATCTATAAAAAGTAAACATAAAATAAAGGTAGTAAGTAGCTTTTGGTTG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 151 Position = 28 to 252 

ATTATGTATTGTGTAGTATAGTATATTGTAAGAAATTTTTTTTTCTAGGGAATATGC 
GTTTTGATGTAGTAGTATTTCACTGTTTTGATTTAGTGTTTGTTGCACGGCAGTAGC 
GAGAGACAAGTGGGAAAGAGTAGGATAAAAAGACAATCTATAAAAAGTAAACATAAA 
ATAAAGGTAGTAAGTAGCTTTTGGTTGAACATCCGGGTAAGAGACAACAGGGCT 



Example of a multi-celled permanent connectron - C. 
elegans 



In this example the existence of the T1-T2 (569-596)
long loop is controlled by a C1/C2 short loop (24442).
The expression of this C1/C2 short loop is controlled
only by the gene F20D6.4.

24442 Chromosome 5 



| Chromosome 1 | 

569 596

A double stranded DNA loop of length 30.606 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 569. This T1 control element has the
DNA sequence 

Seq. Id. = 152 Position = 1 to 39
AAATCGAGCCCGTAAATCGACACAAGCGCTACAGTAGTC 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 596. This T2 
control element has the DNA sequence 

Seq. Id. = 153 Position = 1 to 42 
AGTGCTACAGTAGTCATTTAAAGAATTACTGTAGTTTTCGCT 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 5 whose identifier is 
24442 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene F20D6.4 and has
the DNA sequence 

Seq. Id. = 154 Position = 1 to 58 

GAGCCCGTAAATCGACACAAGCGCTACAGTAGTCATTTAAAGAATTACTGTAGTTTT 
C 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 154 Position = 1 to 34
GAGCCCGTAAATCGACACAAGCGCTACAGTAGTC 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 154 Position = 23 to 58 
GCTACAGTAGTCATTTAAAGAATTACTGTAGTTTTC

7. Transient connectrons exist in prokaryotes,
archea, single-celled eukaryotes and multi-celled
eukaryotes.

A class of connectron relationships exists that permits one
C1/C2 short loop to control the existence of one or more
T1-T2 long loops such that this C1/C2 short loop is
itself subject to expression control by another T1-T2
long loop which surrounds it. These connectron
relationships are described as "transient". Transient
connectrons exist in prokaryotes, archea, single-celled
eukaryotes and multi-celled eukaryotes.

Example of a prokaryote transient connectron - E. coli 

In this example the existence of the T1-T2 (3227-3329)
long loop is controlled by the C1/C2 (3225) short loop.
The expression of this C1/C2 short loop is controlled by
the existence of the T1-T2 (3216-3324) long loop. The
existence of this T1-T2 long loop is itself determined by
the expression of the C1/C2 (3223) short loop. The C1/C2
(3225) short loop is the transient connectron.



3223 Chromosome 1

| Chromosome 1 |

3216 3324
| 3225 |

3225 Chromosome 1
|

* * *

| Chromosome 1 |

3227 3329

A double stranded DNA loop of length 93.464 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3216. This T1 control element has
the DNA sequence

Seq. Id. = 155 Position = 1 to 337 


AGCGCAAGCGAAGCTCTTGATCGAAGCCCCGGTAAACGGCGGCCGTAACTATAACGG 
TCCTAAGGTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCTGCACGAATGGCGTAAT 
GATGGCCAGGCTGTCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGAAGATGC 
AGTGTACCCGCGGCAAGACGGAAAGACCCCGTGAACCTTTACTATAGCTTGACACTG 
AACATTGAGCCTTGATGTGTAGGATAGGTGGGAGGCTTTGAAGTGTGGACGCCAGTC
TGCATGGAGCCGACCTTGAAATACCACCCTTTAATGTTTGATGTTCTAACGT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3324. This T2 
control element has the DNA sequence

Seq. Id. = 156 Position = 1 to 337 

CCCGGTAAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGG 
GTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCCAGGCTGTCTCCACCCGAGAC
TCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTACCCGCGGCAAGACGGAAAGAC 
CCCGTGAACCTTTACTATAGCTTGACACTGAACATTGAGCCTTGATGTGTAGGATAG 
GTGGGAGGCTTTGAAGTGTGGACGCCAGTCTGCATGGAGCCGACCTTGAAATACCAC 
CCTTTAATGTTTGATGTTCTAACGTTGACCCGTAATCCGGGTTGCGGACAGT 


This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes

A C1/C2 short loop on chromosome 1 whose identifier is
3225 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene rrlC and has the DNA sequence

Seq. Id. = 157 Position = 1 to 137 

AAACAGAATTTGCCTGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAAC 
TCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTA 
GGGAACTGCCAGGCATCAAATTA

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 
3323 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as a RNA 
single strand that is 3'UTR to the gene rrlA and has the 
DNA sequence 

Seq. Id. = 158 Position = 1 to 362 

GCGAAGCTCTTGATCGAAGCCCCGGTAAACGGCGGCCGTAACTATAACGGTCCTAAG 
GTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCC 
AGGCTGTCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTAC 
CCGCGGCAAGACGGAAAGACCCCGTGAACCTTTACTATAGCTTGACACTGAACATTG 
AGCCTTGATGTGTAGGATAGGTGGGAGGCTTTGAAGTGTGGACGCCAGTCTGCATGG 
AGCCGACCTTGAAATACCACCCTTTAATGTTTGATGTTCTAACGTAACGTTGACCCG 
TAATCCGGGTTGCGGACAGT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 158 Position = 1 to 330 

GCGAAGCTCTTGATCGAAGCCCCGGTAAACGGCGGCCGTAACTATAACGGTCCTAAG 
GTAGCGAAATTCCTTGTCGGGTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCC 
AGGCTGTCTCCACCCGAGACTCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTAC 
CCGCGGCAAGACGGAAAGACCCCGTGAACCTTTACTATAGCTTGACACTGAACATTG 
AGCCTTGATGTGTAGGATAGGTGGGAGGCTTTGAAGTGTGGACGCCAGTCTGCATGG 
AGCCGACCTTGAAATACCACCCTTTAATGTTTGATGTTCTAACGT 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 158 Position = 21 to 362

CCCGGTAAACGGCGGCCGTAACTATAACGGTCCTAAGGTAGCGAAATTCCTTGTCGG 
GTAAGTTCCGACCTGCACGAATGGCGTAATGATGGCCAGGCTGTCTCCACCCGAGAC
TCAGTGAAATTGAACTCGCTGTGAAGATGCAGTGTACCCGCGGCAAGACGGAAAGAC 
CCCGTGAACCTTTACTATAGCTTGACACTGAACATTGAGCCTTGATGTGTAGGATAG 
GTGGGAGGCTTTGAAGTGTGGACGCCAGTCTGCATGGAGCCGACCTTGAAATACCAC 
CCTTTAATGTTTGATGTTCTAACGTTGACCCGTAATCCGGGTTGCGGACAGT 




A double stranded DNA loop of length 93.749 kilo-bases on 
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3227. This T1 control element has
the DNA sequence 

Seq. Id. = 159 Position = 1 to 52 

AGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGG

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3329. This T2 
control element has the DNA sequence 


Seq. Id. = 160 Position = 1 to 52 

CATGCGAGAGTAGGGAACTGCCAGGCATCAAATAAAACGAAAGGCTCAGTCG 

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

hemD cyaA cyaY b3808 dapF

uvrD b3814 corA yigF yigG

rarD yigI pldA recQ yigJ

yigK pldB yigL yigM metR

metE ysgA udp yigN ubiE

yigP b3836 yigU yigW_1 rfaH

yigC ubiB fadA fadB pepQ

trkH hemG rrsA ileT rrlA

rrfA



The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 


A C1/C2 short loop on chromosome 1 whose identifier is 
3225 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene rrlC and has the
DNA sequence

Seq. Id. = 161 Position = 1 to 137 

AAACAGAATTTGCCTGGCGGCCGTAGCGCGGTGGTCCCACCTGACCCCATGCCGAAC 
TCAGAAGTGAAACGCCGTAGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTA
GGGAACTGCCAGGCATCAAATTA

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 161 Position = 76 to 127 

AGCGCCGATGGTAGTGTGGGGTCTCCCCATGCGAGAGTAGGGAACTGCCAGG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 161 Position = 103 to 135 
CATGCGAGAGTAGGGAACTGCCAGGCATCAAAT 



Example of an archea transient connectron - M. jannaschii 

In this example the existence of the T1-T2 (1139-1159) 
long loop is controlled by the C1/C2 (533) short loop. 
The expression of this C1/C2 short loop is controlled by 
the existence of the T1-T2 (532-622) long loop. The 
existence of this T1-T2 long loop is itself determined by 
the expression of the C1/C2 (1629) short loop. The C1/C2 
(533) short loop is the transient connectron. 

1629 Chromosome 1

| Chromosome 1 |

532 622
| 533 |

533 Chromosome 1

|

| Chromosome 1 |

1139 1159



A double stranded DNA loop of length 78.672 kilo-bases on 
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 532. This T1 control element has the
DNA sequence 

Seq. Id. = 162 Position = 1 to 33 
ATATGTTTGAAATTTGAAAATAAGAGTATTTAG 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 622. This T2 
control element has the DNA sequence 

Seq. Id. = 163 Position = 1 to 47 

TTGAAAATAAGAGCATTTAGAAGTTATTAATTAGTTCAAAGGATTTT 
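
Each listing header states a position range that should agree with the number of bases printed beneath it. The following Python sketch shows one way such a consistency check could be carried out. It is offered as an illustration under stated assumptions, not as part of the disclosed method: the function name check_listing and the regular expression are hypothetical, and the header is assumed to follow the "Seq. Id. = n Position = a to b" form used in this document.

```python
import re

# Pattern for headers such as "Seq. Id. = 162 Position = 1 to 33" (assumed form).
HEADER = re.compile(r"Seq\. Id\. = (\d+) Position = (\d+) to (\d+)")


def check_listing(header, sequence_lines):
    """Compare the span stated in a header with the printed sequence length.

    Returns (seq_id, stated_length, printed_length, consistent).
    """
    found = HEADER.search(header)
    if found is None:
        raise ValueError("unrecognized header: " + header)
    seq_id, start, end = (int(group) for group in found.groups())
    # Join the printed lines and drop any embedded whitespace.
    sequence = "".join("".join(sequence_lines).split())
    stated = end - start + 1
    return seq_id, stated, len(sequence), stated == len(sequence)


# Headers and sequences copied from the M. jannaschii listing above.
t1_check = check_listing(
    "Seq. Id. = 162 Position = 1 to 33",
    ["ATATGTTTGAAATTTGAAAATAAGAGTATTTAG"],
)
t2_check = check_listing(
    "Seq. Id. = 163 Position = 1 to 47",
    ["TTGAAAATAAGAGCATTTAGAAGTTATTAATTAGTTCAAAGGATTTT"],
)
print(t1_check, t2_check)
```

Both M. jannaschii control elements pass the check: the stated spans of 33 and 47 positions equal the number of bases printed.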

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 



MJ0486 MJ0487 MJ0488 MJ0489 MJ0490

MJ0492 MJ0493 MJ0494 MJ0495 MJ0496

MJ0497 MJ0499 MJ0500 MJ0501 MJ0502

MJ0503 MJ0504 MJ0506 MJ0507 MJ0508

MJ0509 MJ0510 MJ0511 MJ0512 MJ0513

MJ0514 MJ0514 MJ0517 MJ0519 MJ0520

MJ0521 MJ0522 MJ0523 MJ0525 MJ0526

MJ0526 MJ0529 MJ0530 MJ0531 MJ0532

MJ0534 MJ0535 MJ0536 MJ0538 MJ0539

MJ0540 MJ0541 MJ0542 MJ0543 MJ0544

MJ0545 MJ0547 MJ0548 MJ0549 MJ0550 

MJ0552 MJ0553 MJ0554 MJ0555 MJ0556 

MJ0557 MJ0558 MJ0559 MJ0560 MJ0561 

MJ0562 MJ0563 MJ0564 MJ0565 MJ0566 

MJ0568 MJ0569 MJ0570 

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops



A C1/C2 short loop on chromosome 1 whose identifier is 
533 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop is 
expressed as an RNA single strand that is 3'UTR to the
gene MJ0485 and has the DNA sequence 

Seq. Id. = 164 Position = 1 to 64 

ATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTA 
TTGAATT 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 
1629 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene MJ1597 and has
the DNA sequence 



Seq. Id. = 165 Position = 1 to 139

ATATGTTTGAAATTTGAAAATAAGAGTATTTAGAAGTTATTAATTAGTTCAAAGGAT
TTTTATTTAATTTCTAAGGGTTTGCTGGTTTGATTATTTAGAATATTTGAGTTTATT 
GAATTATTCAGATTTTTAAAAATTA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 165 Position = 1 to 33 

ATATGTTTGAAATTTGAAAATAAGAGTATTTAG

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 165 Position = 28 to 60
ATTTAGAAGTTATTAATTAGTTCAAAGGATTTT 




A double stranded DNA loop of length 14.509 kilo-bases on 
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 1139. This T1 control element has
the DNA sequence 


Seq. Id. = 166 Position = 1 to 78 

ATTTATTAATTAGTTCAAAGGATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGAT 
TGTTTAAAATATTTGAGTTTA 


This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 1159. This T2 
control element has the DNA sequence

Seq. Id. = 167 Position = 1 to 78



ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT 
TATTCAGATTTTTAAAAATTA 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

MJ1096 MJ1097 tRNA-Arg-3 MJ1098 MJ1099

MJ1100 MJ1101 MJ1102 MJ1103 MJ1104 

MJ1105 MJ1106 MJ1107 MJ1108 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 
533 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene MJ0485 and has
the DNA sequence 

Seq. Id. = 168 Position = 1 to 64 

ATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTA 
TTGAATT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 168 Position = 1 to 37 
ATTTTTATTTAATTTCTAAGGGTTAGCTGGTTTGATT

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 168 Position = 7 to 64

ATTTAATTTCTAAGGGTTAGCTGGTTTGATTATTTAGAATATTTGAGTTTATTGAAT 
T 



Example of a single-celled transient connectron - S. 
cerevisiae

In this example the existence of the T1-T2 (2840-2859)
long loop is controlled by the C1/C2 (298) short loop.
The expression of this C1/C2 short loop is controlled by
the existence of the T1-T2 (293-320) long loop. The
existence of this T1-T2 long loop is itself determined by
the expression of the C1/C2 (86) short loop. The C1/C2
(298) short loop is the transient connectron.
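
The chain of control described in this example (a C1/C2 short loop forms a T1-T2 long loop, which in turn governs a C1/C2 short loop lying inside it) can be represented as a small graph and walked from its root. The Python sketch below is only an illustration of that idea. The LongLoop record, its field names and the control_chain function are hypothetical and do not appear in the specification; the identifiers are those given in this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LongLoop:
    t1: int                  # identifier of the left (T1) control element
    t2: int                  # identifier of the right (T2) control element
    chromosome: int
    controlled_by: int       # identifier of the C1/C2 short loop forming it
    enclosed_c: tuple = ()   # C1/C2 short loops lying inside the long loop


def control_chain(root_c, loops):
    """List (C1/C2 identifier, (T1, T2)) pairs reachable from root_c."""
    chain = []
    seen = set()
    stack = [root_c]
    while stack:
        c_id = stack.pop()
        if c_id in seen:
            continue  # guards against cycles such as self-limiting loops
        seen.add(c_id)
        for loop in loops:
            if loop.controlled_by == c_id:
                chain.append((c_id, (loop.t1, loop.t2)))
                # Short loops inside this long loop are controlled next.
                stack.extend(reversed(loop.enclosed_c))
    return chain


# Identifiers from the S. cerevisiae transient example.
yeast_loops = [
    LongLoop(t1=293, t2=320, chromosome=2, controlled_by=86, enclosed_c=(298,)),
    LongLoop(t1=2840, t2=2859, chromosome=7, controlled_by=298),
]
yeast_chain = control_chain(86, yeast_loops)
print(yeast_chain)
```

Starting from C1/C2 short loop 86, the walk visits the 293-320 long loop and then the 2840-2859 long loop, with short loop 298 as the intermediate (transient) element.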



86 Chromosome 1

*

| Chromosome 1 |

293 320
| 298 |

298 Chromosome 1

| Chromosome 7 |

2840 2859

A double stranded DNA loop of length 38.470 kilo-bases on
chromosome 2 is bounded on the left by a T1 sequence
whose identifier is 293. This T1 control element has the
DNA sequence

Seq. Id. = 169 Position = 1 to 258 


GAATTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAATA 
TATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTGTCATCGAAG 
TTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAGGATAATGAAACATATAAA 
ACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG 
ATTCCTATATCCTTGAGGAGAACTTCTAGT

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 320. This T2 
control element has the DNA sequence 


Seq. Id. = 170 Position = 1 to 77



AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAG 
AACTTCTAGTATATTCTGTA 


This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 



YBL005W-B TS(AGA)B YBL004W YBL003C YBL002W

YBL001C YBR001C YBR002C YBR003W YBR004C

YBR005W YBR006W YBR007C YBR008C YBR009C

YBR010W YBR011C YBR012C

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 2 whose identifier is
298 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene YBL005W-B and has the DNA sequence

Seq. Id. = 171 Position = 1 to 342

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTATCAACTA 
ATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTAT 
GAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAG 
GATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAAT
ATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAACTTCTAGTATATTCTGT 
ATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTC 

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 1 whose identifier is 86
controls the expression of the genes in this T1/T2 long
loop. This C1/C2 short loop is expressed as an RNA single
strand that is 3'UTR to the gene YAR009C and has the DNA
sequence

Seq. Id. = 172 Position = 1 to 362 

ATCTATTACATTATGGGTGGTATGTTGGAATAGAAATCAACTATCATCTACTAACTA
GTATTTACATTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAATGATG 
AGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGCAAGGATTGATAATGTAATAGG 
ATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTA
GAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATAT
TCTGTATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAA 
CATTCACCCATTTCTCAGAA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 172 Position = 184 to 264 

AAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATT
CCATTTTGAGGATTCCTATATCCT 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 172 Position = 215 to 291 

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAG 
AACTTCTAGTATATTCTGTA 




A double stranded DNA loop of length 5.302 kilo-bases on
chromosome 7 is bounded on the left by a T1 sequence
whose identifier is 2840. This T1 control element has
the DNA sequence 

Seq. Id. = 173 Position = 1 to 313 

TCTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAATATA
TTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTGTCATCGAAGTT 
AGAGGAAGCTGAAACGCAAGGATTGATAATGTAATAGGATCAATGAATATAAACATA 
TAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTT
GAGGATTCCTATATCCTCGAGGAGAACTTCTAGTATATTCTGTATACCTAAATTATA
GCCTTTATCAACAATGGAATCCCAACAA 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 2859. This T2
control element has the DNA sequence 

Seq. Id. = 174 Position = 1 to 314 

CTATCAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGATGA
CATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAACGCAAGGATTGAT 
AATGTAATAGGATCAATGAATATAAACATATAAAACGGAATGAGGAATAATCGTAAT 
ATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAGAAC 
TTCTAGTATATTCTGTATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAAC
AATTATCTCAACATTCACATATTTCTCAT

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 2 whose identifier is
298 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene YBL005W-B and has
the DNA sequence


Seq. Id. = 175 Position = 1 to 342 

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTATCAACTA 
ATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTAT 
GAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAG
GATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAAT 
ATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAACTTCTAGTATATTCTGT 
ATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTC

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 175 Position = 23 to 147

TGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAATATATT 
ATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTGTCATCGAAGTTAG 
AGGAAGCTGAA 


The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 175 Position = 48 to 146 

CTATCAACTAATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGATGA
CATAAGTTATGAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAA




Example of a multi-celled transient connectron - C. 
elegans 

In this example the existence of the T1-T2 (22072-22108)
long loop is controlled by the C1/C2 (125) short loop.
The expression of this C1/C2 short loop is controlled by
the existence of the T1-T2 (110-129) long loop. The
existence of this T1-T2 long loop is itself determined by
the expression of the C1/C2 (16859) short loop. The
C1/C2 (125) short loop is the transient connectron.

16859 Chromosome 4

|

* * *

| Chromosome 1 |

110 129
| 125 |

125 Chromosome 1

| Chromosome 5 |

22072 22108



A double stranded DNA loop of length 18.855 kilo-bases on 
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 110. This T1 control element has the
DNA sequence 

Seq. Id. = 176 Position = 1 to 33 


AGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGC

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 129. This T2 
control element has the DNA sequence

Seq. Id. = 177 Position = 1 to 123

TTCTCCCGCATTTTTTGTAGATCTACGTAGATCAAACCGAAATGAGGCACTTTCTGA
ATCCACGAGCTAGGCTTAAGCTTAGGCTTAAGCTTAGGCCTTTTCTCAGGCTTAGGC
TTAGGCTTA

This long T1/T2 double stranded DNA loop modulates the
expression of the following genes

ZC123.3 ZC123.2

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 
125 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3'UTR to the
gene ZC123.3 and has the DNA sequence

Seq. Id. = 178 Position = 1 to 89

ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTCGGCAAAC
TCTTTCATTTCAATTTATGAGGGAAGCCAGAA

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 4 whose identifier is
16859 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3'UTR to the gene F58E2.7 and has
the DNA sequence 

Seq. Id. = 179 Position = 1 to 166 


CTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTT 
AAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGG 
CTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGACTTA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 179 Position = 11 to 43

AGCTTAGGCTTAAGCTTAGGCTTAAGCTTAGGC

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 179 Position = 3 to 33 
TAGGCTTAAGCTTAGGCTTAAGCTTAGGC 




A double stranded DNA loop of length 51.031 kilo-bases on 
chromosome 5 is bounded on the left by a T1 sequence
whose identifier is 22072. This T1 control element has
the DNA sequence

Seq. Id. = 180 Position = 1 to 57 
CGCAACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGACCTAGTTCGGC

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 22108. This T2 
control element has the DNA sequence 


Seq. Id. = 181 Position = 1 to 170 

TGACAATCGCCTGCCGGACAACGCGTGGAAAAGTGTCGTGTACTCCACACGGACAAA 
TACATTTAGTTTTACAACTAAAATCGAACCGCGACGCGACACGCAACGCGACGTAAA 
TCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTCGGCAAACTCTTCTATTTC

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes

F36H9.3 F36H9.4 F36H9.5 F36H9.2 F36H9.1

F36H9.6

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 
125 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene ZC123.3 and has
the DNA sequence 

Seq. Id. = 182 Position = 1 to 89 

ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTCGGCAAAC 
TCTTTCATTTCAATTTATGAGGGAAGCCAGAA 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 182 Position = 1 to 41 
ACGCGCCGTAAATCTACCCCAGATATGGCCGAGCCAAAATG 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 182 Position = 7 to 61

CGTAAATCTACCCCAGATATGGCCGAGCCAAAATGGCCTAGTTCGGCAAACTCTT

8. Self-limiting connectrons occur in
prokaryotes, archea, single-celled eukaryotes and
multi-celled eukaryotes

A class of connectron relationships exists that permits one
C1/C2 short loop to control the existence of the T1-T2
long loop that surrounds it. These connectron
relationships are described as "self-limiting". Self-limiting
connectrons exist in prokaryotes, archea,
single-celled eukaryotes and multi-celled eukaryotes.
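The definition above can be sketched as a simple set test: a C1/C2 short loop is self-limiting when it both controls a T1-T2 long loop and lies inside that same loop. The record type `LongLoop`, its field names and the function `self_limiting_ids` are hypothetical names chosen for this illustration and do not appear in the specification; the second loop uses invented identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LongLoop:
    """A T1-T2 long loop and the C1/C2 short loops related to it (hypothetical record)."""
    t1_id: int
    t2_id: int
    controlled_by: frozenset  # identifiers of C1/C2 short loops that control this loop
    encloses: frozenset       # identifiers of C1/C2 short loops located inside this loop


def self_limiting_ids(loop):
    """Return the short loops that both control the long loop and sit inside it."""
    return sorted(loop.controlled_by & loop.encloses)


# E. coli example from the text: T1-T2 (1704-1718) with C1/C2 short loops 1705 and 1713.
ecoli = LongLoop(1704, 1718, frozenset({1705, 1713}), frozenset({1705, 1713}))

# Counter-example with invented identifiers: the controller lies outside the loop.
other = LongLoop(9001, 9002, frozenset({7000}), frozenset({9003}))

print(self_limiting_ids(ecoli))  # [1705, 1713]
print(self_limiting_ids(other))  # []
```

Representing both relationships as sets keeps the test to a single intersection, which is enough for the examples listed in this section.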

Example of prokaryotic self-limiting connectrons - E.
coli

In this example the existence of the T1-T2 (1704-1718) 
long loop is controlled by two C1/C2 (1705 and 1713) 
short loops. The expression of these C1/C2 short loops 
is controlled by the existence of the T1-T2 (1704-1718) 
long loop. The existence of this T1-T2 long loop is 
itself determined by the expression of the two C1/C2 
(1705 and 1713) short loops. The C1/C2 (1705 and 1713) 
short loops are the self-limiting connectrons.

1705 Chromosome 1
1713 Chromosome 1
        |
       * *
|        Chromosome 1        |
1704                      1718
|      1705      1713        |

A double stranded DNA loop of length 15.259 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 1704. This T1 control element has
the DNA sequence

Seq. Id. = 183 Position = 1 to 71 

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATC 
CGTATGTCACTGGT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 1718. This T2
control element has the DNA sequence 

Seq. Id. = 184 Position = 1 to 71 

TTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACTGGTTCGAGTCCAGTCA 
GAGGAGCCAAATTC 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 

asnT b1978 b1979 b1980 shiA
amn b1983 asnW yeeO
asnU

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 
1705 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene and has the DNA sequence

Seq. Id. = 185 Position = 1 to 98

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATC 
CGTATGTCACTGGTTCGAGTCCAGTCAGAGGAGCCAAATTC 

A C1/C2 short loop on chromosome 1 whose identifier is 
1713 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop is 
expressed as an RNA single strand that is 3' UTR to the
gene asnW and has the DNA sequence 

Seq. Id. = 186 Position = 1 to 86 

CACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACT 
GGTTCGAGTCCAGTCAGAGGAGCCAAATT 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 
1705 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene and has the DNA
sequence 

Seq. Id. = 187 Position = 1 to 98 

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATC 
CGTATGTCACTGGTTCGAGTCCAGTCAGAGGAGCCAAATTC 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 187 Position = 1 to 71

CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATC
CGTATGTCACTGGT

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 187 Position = 28 to 98 

TTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACTGGTTCGAGTCCAGTCA 
GAGGAGCCAAATTC 

A C1/C2 short loop on chromosome 1 whose identifier is
1713 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene asnW and has the
DNA sequence

Seq. Id. = 188 Position = 1 to 86

CACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACT 
GGTTCGAGTCCAGTCAGAGGAGCCAAATT 

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 188 Position = 1 to 60 

CACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACT
GGT

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 188 Position = 17 to 86

TTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACTGGTTCGAGTCCAGTCA 
GAGGAGCCAAATT 
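The match positions listed for C1/C2 short loop 1713 are consistent with taking the longest stretch of bases shared exactly between a control element and the short loop. The sketch below assumes that reading; the specification does not state the matching rule at this point, so exact longest-common-substring matching is an assumption, and the names `longest_common_substring` and `match_position` are hypothetical. Positions are reported 1-based and inclusive, as in the listings.

```python
def longest_common_substring(a, b):
    """Return (start_in_a, start_in_b, length), 0-based, of the longest exact run shared by a and b."""
    best_len = best_end_a = best_end_b = 0
    prev = [0] * (len(b) + 1)
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # Extend the run that ended at the previous base of both sequences.
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len, best_end_a, best_end_b = curr[j], i, j
        prev = curr
    return best_end_a - best_len, best_end_b - best_len, best_len


def match_position(control, short_loop):
    """1-based inclusive (first, last) position of the match within short_loop."""
    _, start, length = longest_common_substring(control, short_loop)
    return start + 1, start + length


# Sequences copied from Seq. Id. 183 (T1, 1704), 184 (T2, 1718) and 188 (C1/C2, 1713).
T1_1704 = ("CGCCCCGTTCACACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATC"
           "CGTATGTCACTGGT")
T2_1718 = ("TTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACTGGTTCGAGTCCAGTCA"
           "GAGGAGCCAAATTC")
C_1713 = ("CACGATTCCTCTGTAGTTCAGTCGGTAGAACGGCGGACTGTTAATCCGTATGTCACT"
          "GGTTCGAGTCCAGTCAGAGGAGCCAAATT")

print(match_position(T1_1704, C_1713))  # (1, 60)
print(match_position(T2_1718, C_1713))  # (17, 86)
```

The quadratic table is adequate for control elements of a few hundred bases; a genome-scale search would need an indexed approach, which is outside the scope of this sketch.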



Example of archea self-limiting connectrons - M.
jannaschii

In this example the existence of the T1-T2 (1447-1471)
long loop is controlled by two C1/C2 (1448 and 1470)
short loops. The expression of these C1/C2 short loops
is controlled by the existence of the T1-T2 (1447-1471)
long loop. The existence of this T1-T2 long loop is
itself determined by the expression of the two C1/C2
(1448 and 1470) short loops. The C1/C2 (1448 and 1470)
short loops are the self-limiting connectrons.



1448 Chromosome 1
1470 Chromosome 1
        |
       * *
|        Chromosome 1        |
1447                      1471
|      1448      1470        |



A double stranded DNA loop of length 22.675 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 1447. This T1 control element has
the DNA sequence

Seq. Id. = 189 Position = 1 to 95

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTACCATTA
CTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATA

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 1471. This T2 
control element has the DNA sequence 

Seq. Id. = 190 Position = 1 to 95

CAACTAACAACCGTATCGAATTTACCATTACTTGGAAATCTATTTAAAACCTCTTTA 
ATCTTGTGATAATAAATTCTAATCGATTCGTGACTTAT 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 



MJ1402 MJ1403 MJ1404 MJ1405 MJ1406
MJ1407 MJ1408 MJ1409 MJ1410 MJ1411
MJ1412 MJ1413 MJ1414 MJ1415 MJ1416
MJ1417 MJ1418 MJ1419 MJ1420



This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops



A C1/C2 short loop on chromosome 1 whose identifier is 
1448 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop is 
expressed as an RNA single strand that is 3' UTR to the
gene MJ1401 and has the DNA sequence 



Seq. Id. = 191 Position = 1 to 122

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTACCATTA
CTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATAATAAATTCTAATCGATTCG
TGACTTAT

A C1/C2 short loop on chromosome 1 whose identifier is 
1470 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene MJ1420 and has the DNA sequence

Seq. Id. = 192 Position = 1 to 116 

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTACCATTA 
CTTGGAAATCTATTTAAAACCTCTTTAATCTTGTGATAATAAATTCTAATCGATTCG
TG 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 
1470 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1420 and has
the DNA sequence

Seq. Id. = 193 Position = 1 to 116 

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTACCATTA
CTTGGAAATCTATTTAAAACCTCTTTAATCTTGTGATAATAAATTCTAATCGATTCG
TG

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 193 Position = 1 to 89

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTACCATTA 
CTTGGAAATCTATTTAAAACCTCTTTAATCTT 

The match between the T2 sequence and the C1/C2 sequence 
is

Seq. Id. = 193 Position = 28 to 116 

CAACTAACAACCGTATCGAATTTACCATTACTTGGAAATCTATTTAAAACCTCTTTA 
ATCTTGTGATAATAAATTCTAATCGATTCGTG

A C1/C2 short loop on chromosome 1 whose identifier is 
1448 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene MJ1401 and has
the DNA sequence 

Seq. Id. = 194 Position = 1 to 122 

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTACCATTA
CTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATAATAAATTCTAATCGATTCG 
TGACTTAT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 194 Position = 1 to 95

TTATAGAACATTATGAAGCTTTTTACTCAACTAACAACCGTATCGAATTTACCATTA
CTTGGAAATCTATTTAAAACCTCTTTAATCTTATGATA

The match between the T2 sequence and the C1/C2 sequence 
is

Seq. Id. = 194 Position = 29 to 99 

CAACTAACAACCGTATCGAATTTACCATTACTTGGAAATCTATTTAAAACCTCTTTA
ATCTT



Example of a single-celled self-limiting connectron - S.
cerevisiae



In this example the existence of the T1-T2 (293-320)
long loop is controlled by the C1/C2 (298) short loop. The
expression of this C1/C2 short loop is controlled by the
existence of the T1-T2 (293-320) long loop. The
existence of this T1-T2 long loop is itself determined by
the expression of the C1/C2 (298) short loop. The C1/C2
(298) short loop is the self-limiting connectron.



298 Chromosome 2

|        Chromosome 2        |
293                       320
|            298             |



A double stranded DNA loop of length 38.470 kilo-bases on
chromosome 2 is bounded on the left by a T1 sequence
whose identifier is 293. This T1 control element has the
DNA sequence

Seq. Id. = 195 Position = 1 to 258

GAATTGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAATA 
TATTATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTGTCATCGAAG 
TTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAGGATAATGAAACATATAAA 
ACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGG 
ATTCCTATATCCTTGAGGAGAACTTCTAGT

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 320. This T2 
control element has the DNA sequence 

Seq. Id. = 196 Position = 1 to 77 

AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCTCGAGGAG 
AACTTCTAGTATATTCTGTA 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 



YBL005W-B TS(AGA)B YBL004W YBL003C YBL002W
YBL001C YBR001C YBR002C YBR003W YBR004C
YBR005W YBR006W YBR007C YBR008C YBR009C
YBR010W YBR011C YBR012C



This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 2 whose identifier is
298 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene YBL005W-B and has the DNA sequence

Seq. Id. = 197 Position = 1 to 342

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTATCAACTA 
ATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTAT 
GAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAG 
GATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAAT
ATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAACTTCTAGTATATTCTGT 
ATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTC 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 2 whose identifier is
298 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YBL005W-B and has
the DNA sequence

Seq. Id. = 198 Position = 1 to 342 

ATCTATTACATTATGGGTGGTATGTTGGAATAAAAATCCACTATCGTCTATCAACTA
ATAGTTATATTATCAATATATTATCATATACGGTGTTAAGATGATGACATAAGTTAT 
GAGAAGCTGTCATCGAAGTTAGAGGAAGCTGAAGTGCAAGGATTGATAATGTAATAG 
GATAATGAAACATATAAAACGGAATGAGGAATAATCGTAATATTAGTATGTAGAAAT 
ATAGATTCCATTTTGAGGATTCCTATATCCTTGAGGAGAACTTCTAGTATATTCTGT
ATACCTAATATTATAGCCTTTATCAACAATGGAATCCCAACAATTATCTCAACATTC

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 198 Position = 23 to 276

TGTTGGAATAAAAATCCACTATCGTCTATCAACTAATAGTTATATTATCAATATATT 
ATCATATACGGTGTTAAGATGATGACATAAGTTATGAGAAGCTGTCATCGAAGTTAG
AGGAAGCTGAAGTGCAAGGATTGATAATGTAATAGGATAATGAAACATATAAAACGG 
AATGAGGAATAATCGTAATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTC 
CTATATCCTTGAGGAGAACTTCTAGT

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 198 Position = 210 to 259
AATATTAGTATGTAGAAATATAGATTCCATTTTGAGGATTCCTATATCCT



Example of a multi-celled self-limiting connectron - C.
elegans



In this example the existence of the T1-T2 (17154-17190)
long loop is controlled by the C1/C2 (17155) short loop. The
expression of this C1/C2 short loop is controlled by the
existence of the T1-T2 (17154-17190) long loop. The
existence of this T1-T2 long loop is itself determined by
the expression of the C1/C2 (17155) short loop. The C1/C2
(17155) short loop is the self-limiting connectron.



17155 Chromosome 4
        |
        *
|        Chromosome 4        |
17154                   17190
|           17155            |

A double stranded DNA loop of length 89.919 kilo-bases on
chromosome 4 is bounded on the left by a T1 sequence
whose identifier is 17154. This T1 control element has
the DNA sequence

Seq. Id. = 199 Position = 1 to 29

AAATTTCCGGCAAATCGGCAAACTGGCAA

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 17190. This T2
control element has the DNA sequence

Seq. Id. = 200 Position = 1 to 29

AATTTGCCGATTTGCCGAATTTGTCGACA

This long T1/T2 double stranded DNA loop modulates the 
expression of the following genes 



R08C7.11 M01H9.2 M01H9.3 M01H9.4 M01H9.1
ZK180.1 ZK180.2 ZK180.3 ZK180.4 ZK180.5
ZK180.6 ZK185.3 ZK185.2

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 



A C1/C2 short loop on chromosome 4 whose identifier is
17155 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene R08C7.1 and has the DNA sequence

Seq. Id. = 201 Position = 1 to 56

AAATTTCCGGCAAATCGGCAAACTGGCAATTTGCCGATTTGCCGAATTTGTCGACA 

A C1/C2 short loop on chromosome 4 whose identifier is 
17171 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop is
expressed as an RNA single strand that is 3' UTR to the
gene ZK180.2 and has the DNA sequence 

Seq. Id. = 202 Position = 1 to 56 

TGGAAATTTCAGAATTTCAATTTTAATCGGCAAAATTGTACGCATCCTATGAATTT 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 4 whose identifier is 
17155 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene R08C7.1 and has
the DNA sequence

Seq. Id. = 203 Position = 1 to 56 

AAATTTCCGGCAAATCGGCAAACTGGCAATTTGCCGATTTGCCGAATTTGTCGACA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 203 Position = 1 to 29

AAATTTCCGGCAAATCGGCAAACTGGCAA

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 203 Position = 28 to 56
AATTTGCCGATTTGCCGAATTTGTCGACA

9. Geneless connectrons exist in single-celled
and multi-celled eukaryotes

Normally T1-T2 long loops contain genes whose expression
is regulated by the existence of the long loop. When a
T1-T2 long loop does not contain any genes it is
described as being "geneless". The existence of the T1-T2
long loop is itself controlled by one or more C1/C2
short loops that may be on the same or different
chromosomes. The geneless T1-T2 long loops must contain
one or more C1/C2 short loops.
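The two conditions in this definition (no genes inside the loop, and at least one enclosed C1/C2 short loop) can be sketched as a small record with two checks. The class `LongLoop`, its fields and its method names are hypothetical and chosen only for illustration; the values come from the S. cerevisiae example described in this section.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LongLoop:
    """A T1-T2 long loop with the genes and C1/C2 short loops it encloses (hypothetical record)."""
    t1_id: int
    t2_id: int
    genes: tuple
    enclosed_short_loops: tuple

    def is_geneless(self):
        # "Geneless" means the long loop contains no genes at all.
        return len(self.genes) == 0

    def is_consistent(self):
        # A geneless loop must still enclose one or more C1/C2 short loops.
        return (not self.is_geneless()) or len(self.enclosed_short_loops) >= 1


# S. cerevisiae example: T1-T2 (1537-1559) encloses short loops 1538 through 1558.
yeast = LongLoop(1537, 1559, genes=(), enclosed_short_loops=tuple(range(1538, 1559)))

print(yeast.is_geneless())              # True
print(len(yeast.enclosed_short_loops))  # 21
print(yeast.is_consistent())            # True
```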

Example of a single-celled geneless connectron - S.
cerevisiae

In this example the existence of the T1-T2 (1537-1559)
long loop is controlled by three C1/C2 (3789, 5289 and
5753) short loops. The expression of 21 C1/C2 (1538
through 1558) short loops is controlled by the existence
of the T1-T2 (1537-1559) long loop.

3789 Chromosome 9
5289 Chromosome 12
5753 Chromosome 13
        |
       * *
|        Chromosome 4        |
1537                      1559
|     1538 through 1558      |

A double stranded DNA loop of length 4.825 kilo-bases on
chromosome 4 is bounded on the left by a T1 sequence
whose identifier is 1537. This T1 control element has
the DNA sequence

Seq. Id. = 204 Position = 1 to 362 

ATGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAAAGGCTA 
TAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATA 
AAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATG 
TTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATT 
TAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATAC
TAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACATACCACC
CATAATGTAATAGATCTAAT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 1559. This T2 
control element has the DNA sequence 

Seq. Id. = 205 Position = 1 to 362 

ATGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAAAGGCTA
TAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATA
AAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATG
TTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATT
TAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATAC
TAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACATACCACC
CATAATGTAATAGATCTAAT

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 



A C1/C2 short loop on chromosome 4 whose identifier is
1538 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence

Seq. Id. = 206 Position = 1 to 387

ATGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAAAGGCTA 
TAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATA 
AAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATG
TTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATT 
TAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATAC 
TAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACATACCACC 
CATAATGTAATAGATCTAATGAATCCATTTGTTTGTTAATAGTTT 

This T1-T2 loop also modulates the C1/C2 short loops 
numbered 1539 to 1557 

A C1/C2 short loop on chromosome 4 whose identifier is
1558 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence

Seq. Id. = 207 Position = 1 to 307

AGCTTCTCATAACTTATGTCATCATCTTAACACCGTATATGATAATATATTGATAAT 
ATAACTTGTTGGAATAAAAATCAACTATCATCTACTAACTAGTATTTACGTTACTAG 
TATATTATCATATACGGTGTTAGAAGATGACGCAAATGATGAGAAATAGTCATCTAA 
ATTAGTGGAAGCTGA . . . GTCTATCTGGCGAATATAAATTTTTACGCTACACACGTC 
ATCGACATCTAAATATGACAGTCGCTGAACTGTTCTTAGATATCCATGCTATTTATG
AAGAACAACAGGGATCGAGAAACAG 



The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 9 whose identifier is
3789 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YIL059C and has
the DNA sequence

Seq. Id. = 208 Position = 1 to 176

TTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTC 
CACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGA 
TAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAAC 
AGTAT

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 208 Position = 1 to 172

TTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTC
CACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGA 
TAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAAC 
A

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 208 Position = 1 to 172

TTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTC
CACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGA
TAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAAC 
A 

A C1/C2 short loop on chromosome 12 whose identifier is
5289 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YLR301W and has
the DNA sequence

Seq. Id. = 209 Position = 1 to 325

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGT 
ATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCT 
GCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTAATATTCAT 
TGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTAT
TTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAA 
ATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACAC 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 209 Position = 62 to 317

AGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAA 
TTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTAATATTCATTGAT
CCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCT 
CATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATAC 
TAGTTAGTAGATGATAGTTGATTTTTATTCCAACA 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 209 Position = 86 to 324

AGGATTTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTA
TTATCATCGTTTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCG
TTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACA
CCGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTT
TTATTCCAACA

A C1/C2 short loop on chromosome 13 whose identifier is
5753 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YMR044W and has
the DNA sequence

Seq. Id. = 210 Position = 1 to 334

TTGAGAAATGGGGGAATGTTGAGATAATTGTTGGGATTCCATTGTTGATAAAGGCTA 
TAATATTAGGTATACAGAATATACTAGAAGTTCTCCTCAAGGATATAGGAATCCTCA
AAATGGAATCTATATTTCTACATACTAATATTACGATTATTCCTCATTCCGTTTTAT 
ATGTTTCATTATCCTATTACATTATCAATCCTTGCACTTCAGCTTCCTCTAACTTCG 
ATGACAGCTTCTCATAACTTATGTCATCATCTTAACACCGTATATGATAATATATTG
ATAATATAACTATTAGTTGATAGACGATAGTGGATTTTTATTCCAACAT 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 210 Position = 22 to 95 

AGATAATTGTTGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATACAGAATA 
TACTAGAAGTTCTCCTC 

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 210 Position = 28 to 101

TTGTTGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATACAGAATATACTAG 
AAGTTCTCCTCAAGGAT 




Two examples of multi-celled geneless connectrons - C.
elegans

In the first example the existence of the T1-T2 (2342-2344)
long loop is controlled by the C1/C2 (24114) short
loop. The expression of one C1/C2 (2343) short loop is
controlled by the existence of the T1-T2 (2342-2344) long
loop.



24114 Chromosome 5
        |
        *
|        Chromosome 1        |
2342                      2344
|            2343            |



In the second example the existence of the T1-T2 (29221-29262)
long loop is controlled by the C1/C2 (4291) short
loop. The expression of 40 C1/C2 (29222 through 29261)
short loops is controlled by the existence of the T1-T2
(29221-29262) long loop.



4291 Chromosome 1
        |
        *
|        Chromosome 5        |
29221                   29262
|    29222 through 29261     |



A double stranded DNA loop of length 67.059 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 2342. This T1 control element has
the DNA sequence

Seq. Id. = 211 Position = 1 to 37

TGAAAACTACAGTAATTCTTTAAATGACTACTGTAGC

This double stranded DNA loop is bounded on the right by
a T2 control element whose identifier is 2344. This T2
control element has the DNA sequence

Seq. Id. = 212 Position = 1 to 37

CTACTGTAGCGCTTGTGTCGATTTACGGGCTCGATTT

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 1 whose identifier is
2343 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence

Seq. Id. = 213 Position = 1 to 61

TCGACACAAGCGCTACAGTAGCTATTTAAAGAATTACTGTAGTTTTCGCTACGAGAT
ATTT

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 5 whose identifier is
24114 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene C13F10.5 and has
the DNA sequence

Seq. Id. = 214 Position = 1 to 68

GCGAAAACTACAGTAATTCTTTAAATGACTACTGTAGCGCTTGTGTCGATTTACGGG
CTCGATTTTCG 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 214 Position = 3 to 38

GAAAACTACAGTAATTCTTTAAATGACTACTGTAGC

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 214 Position = 29 to 65
CTACTGTAGCGCTTGTGTCGATTTACGGGCTCGATTT

A double stranded DNA loop of length 41.297 kilo-bases on
chromosome 5 is bounded on the left by a T1 sequence
whose identifier is 29221. This T1 control element has
the DNA sequence

Seq. Id. = 215 Position = 1 to 62 

TTTAAATTTCCCGCCAAAAATTGACTGAAAATTTGGATTTTCTTTCCAAAAATTGAC 
AGAAA 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 29262. This T2 
control element has the DNA sequence 

Seq. Id. = 216 Position = 1 to 31 

TGAAAATTTGAATTTCCCGCCAAAAATTAAC 

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 5 whose identifier is 
29222 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence 

Seq. Id. = 217 Position = 1 to 58 

AATTTCCCGCCAAAAATTGACTGAAAATTTGGATTTTCTTTCCAAAAATTGACAGAA
A

This T1-T2 loop also modulates the C1/C2 short loops
numbered 29223 to 29260

A C1/C2 short loop on chromosome 5 whose identifier is 
29261 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence 

Seq. Id. = 218 Position = 1 to 54 

AAAATTGACTGAAAATTTGAATTTCCAGCCAAAAATTGACTGAAAATTTGAATT 

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 1 whose identifier is 
4291 controls the expression of the genes in this T1/T2 
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene Y43F8C.5 and has
the DNA sequence 

Seq. Id. = 219 Position = 1 to 317 

AAAATTAACTGAAAATTTGAATTTCCCGCCAAAAATTGACTGAAAATTTGAATTTCC 
CGCCAAAAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGACTGAAAATTT 
GAATTTCCCGCCAAAAATTAATTGAAAATTTGAATTTCCCGCCAAAAATTAATTGAA 
ACTTTGAATTTTCAA . . . ATTTCCCGCCAAAAATTAATTGAAACTTTGAATTTTCAA 
ATTTCCCGCCAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTAATTGAAAA 
TTTGAATTTTTGAATTTCCCGCCAAAAATGACTGA 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 219 Position = 229 to 260

AAATTTCCCGCCAAAAATTGACTGAAAATTTG

The match between the T2 sequence and the C1/C2 sequence
is

Seq. Id. = 219 Position = 63 to 104
AAAAAAATTGACTGAAAATTTGAATTTCCCGCCAAAAATTGA

10. One connectron controls many geneless
connectrons in single-celled and multi-celled
eukaryotes



One C1/C2 short loop can control the existence of many 
geneless T1-T2 long loops. 
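The one-to-many relationship can be sketched by inverting a table of "long loop is controlled by these short loops" into "short loop controls these long loops". The dictionary layout and the function name `loops_by_controller` are hypothetical; the identifiers are those given for the S. cerevisiae geneless loops in this section and in section 9, with the second loop written as 1243-1272 as in its loop description.

```python
from collections import defaultdict

# Geneless T1-T2 long loops (T1 id, T2 id) and the C1/C2 short loops that control them.
controllers_of = {
    (1142, 1156): [5289],
    (1243, 1272): [5289],
    (7102, 7117): [5289],
    (1537, 1559): [3789, 5289, 5753],  # example from section 9
}


def loops_by_controller(controllers_of):
    """Invert the mapping: C1/C2 identifier -> sorted list of long loops it controls."""
    index = defaultdict(list)
    for loop, short_loops in controllers_of.items():
        for short_loop in short_loops:
            index[short_loop].append(loop)
    return {short_loop: sorted(loops) for short_loop, loops in index.items()}


index = loops_by_controller(controllers_of)

# Keep only the short loops that control more than one long loop.
shared = {k: v for k, v in index.items() if len(v) > 1}
print(shared)
# {5289: [(1142, 1156), (1243, 1272), (1537, 1559), (7102, 7117)]}
```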

Example of a single-celled geneless connectron - S.
cerevisiae

In this example the existence of the three T1-T2 (1142-1156,
1243-1272 and 7102-7117) long loops is controlled
by the C1/C2 (5289) short loop.

5289 Chromosome 12
        |
        *
|        Chromosome 4        |
1142                      1156
|     1143 through 1155      |

5289 Chromosome 12
        |
        *
|        Chromosome 4        |
1243                      1272
|     1244 through 1271      |

5289 Chromosome 12
        |
        *
|        Chromosome 5        |
7102                      7117
|     7103 through 7116      |



A double stranded DNA loop of length 5.337 kilo-bases on
chromosome 4 is bounded on the left by a T1 sequence
whose identifier is 1142. This T1 control element has
the DNA sequence

Seq. Id. = 220 Position = 1 to 318

ATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGTATGTA 
GATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAATT 
CTACACAATTCTATAAATATTATTATCATCATTTTATATGTTAATATTCATTGATCC 
TATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCA 
TCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACTA
GTTAGTAGATGATAGTTGATTTTTATTCCAACA 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 1156. This T2 
control element has the DNA sequence

Seq. Id. = 221 Position = 1 to 295 

TTTTAATAAGGCAATAATATTAGGTATGTAGATATACTAGAAGTTCTCCTCCAGGAT 
TTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATC
ATCATTTTATATGTTAATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCA 
GCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTA 
TATGATAATATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATT 
CCAACAAGAA 


There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 4 whose identifier is
1143 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence

Seq. Id. = 222 Position = 1 to 349

ATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGTATGTA 
GATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAATT 
CTACACAATTCTATAAATATTATTATCATCATTTTATATGTTAATATTCATTGATCC 
TATTACATTATCAAT . . . CTCTAAGTCTCATTGCCTTTGTGCCAAAAAATCTGTTTC 
TAAATTTCTCTTCATTTGTAGACTTAATTATACTGATCGTTGATCTACTATCAGTAA
GTAAGCCTTTAATAATTGGTTTCTTGTTAAGTTCTTGCACAAGGTGACTGAGGTTAT 
TCAATAGCGG 

This T1-T2 loop also modulates the C1/C2 short loops 
numbered 1144 to 1154

A C1/C2 short loop on chromosome 4 whose identifier is 
1155 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence

Seq. Id. = 223 Position = 1 to 69 

GAGGAGAACTTCTAGTATATCTACATACCTAATATTATTGCCTTATTAAAAATGGAA 
TCCCAACAATTA

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 12 whose identifier is
5289 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YLR301W and has
the DNA sequence

Seq. Id. = 224 Position = 1 to 324 


GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGT 
ATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCT
GCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTAATATTCAT
TGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTAT
TTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTACGTAAA
TACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACAC 

The match between the T1 sequence and the C1/C2 sequence
is 


Seq. Id. = 224 Position = 6 to 64 

ATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGTATGTA 
GA 


The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 224 Position = 33 to 64 


TTTTAATAAGGCAATAATATTAGGTATGTAGA 



A double stranded DNA loop of length 5.251 kilo-bases on
chromosome 4 is bounded on the left by a T1 sequence
whose identifier is 1243. This T1 control element has
the DNA sequence

Seq. Id. = 225 Position = 1 to 366

CGTGTTTTATCTCATGTTGTTCGTTTTGTTATTGAGATATATGTGGGTAATTAGATA 
ATTGTTGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATACAGAATATACTA
GAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAAT 
TCTATAAATATTATTATCATCGTTTTATATGTTAATATTCATTGATCCTATTACATT 
ATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCG 
TCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGA
TGATAGTTGATTTTTATTCCAACA

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 1272. This T2 
control element has the DNA sequence 


Seq. Id. = 226 Position = 1 to 273 

TGAGATATATGTGGGTAATTAGATAATTGTTGGGATTCCATTGTTGATAAAGGCTAT 
AATATTAGGTATACAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAA 
AAGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGT
TAATATTCATTGATC . . . TATACTAGTAACGTAAATACTAGTTAGTAGATGATAGTT 
GATTTTTATTCCAACAGTTATAAGGTTGTTTCATATGTGTTTTATGAA 

There are no genes controlled by this T1/T2 loop. 


This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 4 whose identifier is
1244 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence

Seq. Id. = 227 Position = 1 to 327

TTTATCTCATGTTGTTCGTTTTGTTATTGAGATATATGTGGGTAATTAGATAATTGT 
TGGGATTCCATTGTTGATAAAGGCTATAATATTAGGTATACAGAATATACTAGAAGT 
TCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAATTCTACACAATTCTAT
AAATATTATTATCAT . . . GTCTCGATGTAGTATACGTATAAATTATTACCTGATACT 
TCATCTCTAAGTCTCATTGCCTTTGTGCCAAAAAATCTGTTTCTAAATTTCTCTTCA 
TTTGTAGACTTAATTATACTGATCGTTGATCTACTATCAGTAAGT 

This T1-T2 loop also modulates the C1/C2 short loops
numbered 1245 to 1270 

A C1/C2 short loop on chromosome 4 whose identifier is 
1271 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence 

Seq. Id. = 228 Position = 1 to 309 

TGTTGTATCTCAAAATGAGATATGTCAGTATGACAATACGTCATCCTAAACGTTCAT
AAAACACATATGAAACAACCTTATAACTGTTGGAATAAAAATCAACTATCATCTACT 
AACTAGTATTTACGTTACTAGTATATTATCATATACGGTGTTAGAAGATGACGCAAA 
TGATGAGAAATAGTC . . . CAACAATGGAATCCCAACAATTATCTAATTACCCACATA 
TATCTCATGGTAGCGCCTGTGCTTCGGTTACTTCTAAGGAAGTCCACACAAATCAAG
ATCCGTTAGACGTTTCAGCTTCCAAAA

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 12 whose identifier is
5289 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YLR301W and has
the DNA sequence

Seq. Id. = 229 Position = 1 to 325 


GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGT 
ATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCT 
GCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTAATATTCAT 
TGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTAT 
TTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAA
ATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACAC 

The match between the T1 sequence and the C1/C2 sequence
is 


Seq. Id. = 229 Position = 62 to 317 

AGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAA
TTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTAATATTCATTGAT
CCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCT
CATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATAC 
TAGTTAGTAGATGATAGTTGATTTTTATTCCAACA 

The match between the T2 sequence and the C1/C2 sequence 
is

Seq. Id. = 229 Position = 62 to 317 

AGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCTGCAA 
TTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTAATATTCATTGAT
CCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCT 
CATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATAC 
TAGTTAGTAGATGATAGTTGATTTTTATTCCAACA

A double stranded DNA loop of length 5.296 kilo-bases on
chromosome 15 is bounded on the left by a T1 sequence
whose identifier is 7102. This T1 control element has
the DNA sequence

Seq. Id. = 230 Position = 1 to 365 


CATGATTAATATGACCAATCGGCGTGTGTTTTTGAAAAGTGGGTGAATTTTGAGATA 
ATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGTATGTAGAATGTACTAG 
AAGTTCTCCTCAAGGATTTAGGAATCCATGAAAGGGAATCTGCAATTCTACACAATT 
CTATAAATATTATTATCATCATTTTATATGTTAATATTCATTGATCCTATTACATTA 
TCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTATTTCTCATCATTTGCGT
CATCTTCTAACACCGTATATGATAATATACTAGTAACGTAAATACTAGTTAGTAGAT 
GATAGTTGATTTTTATTCCAACA 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 7117. This T2
control element has the DNA sequence 

Seq. Id. = 231 Position = 1 to 365 

TGAAAAGTGGGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATA
ATATTAGGTATGTAGAATGTACTAGAAGTTCTCCTCAAGGATTTAGGAATCCATGAA 
AGGGAATCTGCAATTCTACACAATTCTATAAATATTATTATCATCATTTTATATGTT 
AATATTCATTGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTA 
GATGACTATTTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTA
GTAACGTAAATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACAGTTTTATAT
ACCTCTCTTATTTAGTATAAGAA

There are no genes controlled by this T1/T2 loop.

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 15 whose identifier is
7103 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence 

Seq. Id. = 232 Position = 1 to 357

AAGAACATTGCTGATGTGATGACAAAACCTCTTCCGATAAAAACATTTAAACTATTA 
ACTAACAAATGGATTCATTAGATCTATTACATTATGGGTGGTATGTTGGAATAAAAA 
TCAACTATCATCTACTAACTAGTATTTACGTTACTAGTATATTATCATATACGGTGT 
TAGAAGATGACGCAAATGATGAGAAATAGTCATCTAAATTAGTGGAAGCTGAAACGC
AAGGATTGATAATGTAATAGGATCAATGAATATTAACATATAAAATGATGATAATAA 
TATTTATAGAATTGTGTAGAATTGCAGATTCCCTTTCATGGATTCCTAAATCCTTGA 
GGAGAACTTCTAGTA 

This T1-T2 loop also modulates the C1/C2 short loops
numbered 7104 to 7115 

A C1/C2 short loop on chromosome 15 whose identifier is 
7116 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence 

Seq. Id. = 233 Position = 1 to 66 

CCATTCTGTGGAGGTGGTACTGAAGCAGGTTGAGGAGAGACATGATGATGGTTCTCT
GGAACAGCT

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 12 whose identifier is
5289 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene YLR301W and has
the DNA sequence

Seq. Id. = 234 Position = 1 to 325

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGT 
ATGTAGAATATACTAGAAGTTCTCCTCGAGGATTTAGGAATCCATAAAAGGGAATCT 
GCAATTCTACACAATTCTATAAATATTATTATCATCGTTTTATATGTTAATATTCAT 
TGATCCTATTACATTATCAATCCTTGCGTTTCAGCTTCCACTAATTTAGATGACTAT
TTCTCATCATTTGCGTCATCTTCTAACACCGTATATGATAATATACTAGTAACGTAA 
ATACTAGTTAGTAGATGATAGTTGATTTTTATTCCAACAC 

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 234 Position = 1 to 66 

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGT 
ATGTAGAAT

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 234 Position = 1 to 66

GGTGAATTTTGAGATAATTGTTGGGATTCCATTTTTAATAAGGCAATAATATTAGGT 
ATGTAGAAT

Example of a multi-celled geneless connectron - C. elegans

In this example the existence of the three T1-T2 (1142-1156,
14840-15042 and 15365-15627) long loops is controlled by
the C1/C2 (16760) short loop.

[Diagram: the C1/C2 short loop 16760 on chromosome 4 is drawn
above each of three T1-T2 long loops. The three long loops
enclose the C1/C2 short loops 3103 through 3119, 14841 through
15041 and 15366 through 15625, respectively.]



A double stranded DNA loop of length 15.894 kilo-bases on
chromosome 1 is bounded on the left by a T1 sequence
whose identifier is 3101. This T1 control element has
the DNA sequence

Seq. Id. = 235 Position = 1 to 33 

CAAATCGGCAAATTGCCGGAATTGAACATTTCC

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 3120. This T2 
control element has the DNA sequence 

Seq. Id. = 236 Position = 1 to 54 

AAACGATTTTTCCGGCAAATCGGCAAATTGCCGGAATTGTAATTTCCGGCAAAT

There are no genes controlled by this T1/T2 loop.

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 1 whose identifier is 
3103 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence 

Seq. Id. = 237 Position = 1 to 55 

TTAAAATTTCCGGCAAATCGGCAAATTGGCAGAAATGAAACTCACGGCAAATCGG 

This T1-T2 loop also modulates the C1/C2 short loops 
numbered 3104 to 3118 

A C1/C2 short loop on chromosome 1 whose identifier is 
3119 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence 

Seq. Id. = 238 Position = 1 to 61 






CCCGCATTTTTTGTAGATCAAACCGTAATGGGACGGCCTGGCAACACGTGATTTTCC 
AAAT 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 

A C1/C2 short loop on chromosome 4 whose identifier is
16760 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene T23E1.2 and has
the DNA sequence

Seq. Id. = 239 Position = 1 to 124 

GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACA
TTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAAATTG
CCGGAATTGA

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 239 Position = 30 to 62 
CAAATCGGCAAATTGCCGGAATTGAACATTTCC 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 239 Position = 23 to 53 
TTTCCGGCAAATCGGCAAATTGCCGGAATTG

A double stranded DNA loop of length 86.977 kilo-bases on
chromosome 3 is bounded on the left by a T1 sequence
whose identifier is 14840. This T1 control element has
the DNA sequence

Seq. Id. = 240 Position = 1 to 141 

AAAAATTTCCGGCAAGTCGGCAATTTTCCGAAAATGAAAATTTCCGGCAAATCGGCA
AATTGCCGGAATTGAAAATTCCTGGCAAATCAGCAAATTTGCGGCAAATCGGCAATT 
TGCCGAAAATGAAAATTTCCGGCAAAT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 15042. This T2
control element has the DNA sequence 

Seq. Id. = 241 Position = 1 to 98 

CAAATCGGTAGGTAAATTGGCCAAACTTGAAAATTTCCGGCAAATCGGCAAATTCCG
CGAACTGAACATTTCCGGCAAATCGGCAAATTGCTCGAACT 

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the
expression of the following C1/C2 short loops 

A C1/C2 short loop on chromosome 3 whose identifier is 
14841 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence 

Seq. Id. = 242 Position = 1 to 141

AAAAATTTCCGGCAAGTCGGCAATTTTCCGAAAATGAAAATTTCCGGCAAATCGGCA
AATTGCCGGAATTGAAAATTCCTGGCAAATCAGCAAATTTGCGGCAAATCGGCAATT 
TGCCGAAAATGAAAATTTCCGGCAAAT 


This T1-T2 loop also modulates the C1/C2 short loops 
numbered 14842 to 15040 

A C1/C2 short loop on chromosome 3 whose identifier is 
15041 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence 

Seq. Id. = 243 Position = 1 to 55 


CGGCAATTGCCGTTCGGCAATTTGCCAATTTGCCGGAAATTTTCAATTCCGGCAA 

The expression of genes in this T1/T2 long loop is 
controlled by the following C1/C2 short loops. 


A C1/C2 short loop on chromosome 4 whose identifier is
16760 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene T23E1.2 and has
the DNA sequence

Seq. Id. = 244 Position = 1 to 124 

GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACA 
TTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAAATTG
CCGGAATTGA

The match between the T1 sequence and the C1/C2 sequence
is

Seq. Id. = 244 Position = 22 to 55 
ATTTCCGGCAAATCGGCAAATTGCCGGAATTGAA 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 244 Position = 17 to 45 
TGAACATTTCCGGCAAATCGGCAAATTGC 
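
The position ranges reported for these matches can be checked mechanically. The following is a minimal sketch, not part of the original listing: it assumes that "Position = a to b" is 1-based and inclusive (an inference from the stated lengths), and the helper names `subsequence` and `first_match` are hypothetical. The sequences are copied from Seq. Id. 244 and its two match entries above.

```python
# Seq. Id. 244: the 124-base C1/C2 sequence of short loop 16760, as listed above.
C1C2 = (
    "GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACA"
    "TTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAAATTG"
    "CCGGAATTGA"
)

# Matched fragments as printed in the listing for Seq. Id. 244.
T1_MATCH = "ATTTCCGGCAAATCGGCAAATTGCCGGAATTGAA"  # Position = 22 to 55
T2_MATCH = "TGAACATTTCCGGCAAATCGGCAAATTGC"       # Position = 17 to 45


def subsequence(seq, start, end):
    """Return seq[start..end], treating both positions as 1-based and inclusive
    (assumed convention for the listing's 'Position = a to b')."""
    return seq[start - 1:end]


def first_match(seq, fragment):
    """Return the 1-based inclusive (start, end) of the first exact occurrence
    of fragment in seq, or None if fragment does not occur."""
    idx = seq.find(fragment)
    if idx < 0:
        return None
    return idx + 1, idx + len(fragment)


if __name__ == "__main__":
    print(len(C1C2))                    # length of the C1/C2 sequence
    print(first_match(C1C2, T1_MATCH))  # where the T1 match starts and ends
    print(first_match(C1C2, T2_MATCH))  # where the T2 match starts and ends
```

Because the C1/C2 sequence is repetitive, a fragment may occur more than once; `first_match` reports only the first occurrence, which for these two fragments coincides with the positions printed in the listing.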



A double stranded DNA loop of length 98.488 kilo-bases on 
chromosome 3 is bounded on the left by a T1 sequence
whose identifier is 15365. This T1 control element has
the DNA sequence 

Seq. Id. = 245 Position = 1 to 336 

AAAATTTCCGGCAAATCGGCAATTTGCCAAAAATTGAAATTTCCGGCAAATCGGCAA 
TTTGTCAAAAATGAAAATTTCCGGCAAATCGGCAAATTGCCGAAAATGAAAATTTCC 
GGCAAATCGGCAAACTTCCGGAACTGAAAATTTCCGGCAAATCGGCAATTTGCCATA 
AATGAACATTTCCGG . . . GGCGAAAATTAAAATTTCCGCCATATCGGCAATTTGCCA 
AAAAATTAAAATTTCCGGCAAATCGGCAAATTGCCGGAATTCAAAATTTCCGGCAAA 
CCGGCAAATTGCCGGAACTCAAAATTCCCGGCAAATCAGCAAATTGCCGGAATT 

This double stranded DNA loop is bounded on the right by 
a T2 control element whose identifier is 15627. This T2 
control element has the DNA sequence

Seq. Id. = 246 Position = 1 to 68

TGGCAAACCGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAATTTGCCGG 
AATTGAAATTT

There are no genes controlled by this T1/T2 loop. 

This long T1/T2 double stranded DNA loop modulates the 
expression of the following C1/C2 short loops

A C1/C2 short loop on chromosome 3 whose identifier is 
15366 controls the expression of the genes of one or more
other T1/T2 long loops. This C1/C2 short loop has the
DNA sequence

Seq. Id. = 247 Position = 1 to 60 

TGCCGATTTGCCGGAAATTTTCATTTTCGGCAATTTGCCGATTTGCCGGAAATTTTC 
ATT

This T1-T2 loop also modulates the C1/C2 short loops 
numbered 15366 to 15624 

A C1/C2 short loop on chromosome 3 whose identifier is
15625 controls the expression of the genes of one or more 
other T1/T2 long loops. This C1/C2 short loop has the 
DNA sequence 

Seq. Id. = 248 Position = 1 to 54

TCAAGCAAATTGTCAAATTCGCGGAACTAAACATTTCCGGCAAATCGGCAAATT

The expression of genes in this T1/T2 long loop is
controlled by the following C1/C2 short loops.

A C1/C2 short loop on chromosome 4 whose identifier is
16760 controls the expression of the genes in this T1/T2
long loop. This C1/C2 short loop is expressed as an RNA
single strand that is 3' UTR to the gene T23E1.2 and has
the DNA sequence

Seq. Id. = 249 Position = 1 to 124 

GGCAAATTGCCGAAATTGAACATTTCCGGCAAATCGGCAAATTGCCGGAATTGAACA
TTTCCGGCAAATCGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAAATTG
CCGGAATTGA

The match between the T1 sequence and the C1/C2 sequence
is 

Seq. Id. = 249 Position = 22 to 52 
ATTTCCGGCAAATCGGCAAATTGCCGGAATT 

The match between the T2 sequence and the C1/C2 sequence 
is 

Seq. Id. = 249 Position = 35 to 75 
CGGCAAATTGCCGGAATTGAACATTTCCGGCAAATCGGCAA

Abstract

An algorithm has been developed to identify four DNA
sequences of 20 bases or more that form a structure
called a connectron. Two sequences, C1 and C2, are
adjacent to each other. These sequences are expressed as
RNA in the 3' UTR of some genes in many prokaryotic,
archaeal and eukaryotic genomes. The other half of a
connectron is two DNA sequences, T1 and T2, that are on the
same chromosome and are separated from each other by
about 1kb to 105kb. The C1 sequence is identical to the
T1 sequence and the C2 sequence is identical to the T2
sequence. C1/C2 and T1-T2 can be on different
chromosomes. The C1/C2 RNA sequence of the gene
transcript finds the two double-stranded DNA sequences T1
and T2. The single-stranded RNA and double-stranded DNA
then form a triple-stranded Hoogsteen helix of the
RNA/DNA/DNA variety. Because the C1 sequence is adjacent
to the C2 sequence, the T1 sequence is made spatially
adjacent to the T2 sequence in a compact X-shaped
structure. Chromatin particles form as compact 30nm
assemblies in the DNA between T1 and T2, thus eliminating
the intervening genes from promotion and expression.
Connectrons remove sets of genes from expression and thus
modulate the behavior of many types of cells.
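
The search summarized in this abstract can be illustrated with a small sketch. It is not the algorithm of the specification, only an illustration under stated simplifications: C1 and C2 are taken as two fixed-length windows of 20 bases with C2 starting immediately after C1, only exact same-strand matches are considered, and the T1-to-T2 separation is measured from the end of T1 to the start of T2. The names `kmer_index` and `find_candidates` and the constants are hypothetical; positions are 0-based.

```python
from collections import defaultdict

MIN_LEN = 20        # minimum sequence length stated in the abstract
MIN_GAP = 1_000     # lower bound on T1-T2 separation (about 1kb)
MAX_GAP = 105_000   # upper bound on T1-T2 separation (about 105kb)


def kmer_index(chrom, k=MIN_LEN):
    """Map every k-base window of chrom to the list of its 0-based start positions."""
    index = defaultdict(list)
    for i in range(len(chrom) - k + 1):
        index[chrom[i:i + k]].append(i)
    return index


def find_candidates(utr, chrom, k=MIN_LEN, min_gap=MIN_GAP, max_gap=MAX_GAP):
    """Return a list of (c1, c2, t1, t2) 0-based start positions.

    c1 and c2 are adjacent k-base windows in the 3' UTR sequence `utr`;
    t1 and t2 are exact copies of those windows in `chrom`, with the gap
    between the end of T1 and the start of T2 inside [min_gap, max_gap].
    """
    index = kmer_index(chrom, k)
    hits = []
    for c1 in range(len(utr) - 2 * k + 1):
        c2 = c1 + k  # simplifying assumption: C2 starts right after C1
        for t1 in index.get(utr[c1:c1 + k], ()):
            for t2 in index.get(utr[c2:c2 + k], ()):
                gap = t2 - (t1 + k)
                if min_gap <= gap <= max_gap:
                    hits.append((c1, c2, t1, t2))
    return hits


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    c1_seq = "ACGT" * 5    # 20 bases
    c2_seq = "TTGCA" * 4   # 20 bases
    chrom = c1_seq + "A" * 1500 + c2_seq
    utr = "GG" + c1_seq + c2_seq + "CC"
    print(find_candidates(utr, chrom))
```

The listings in this document report matches of varying length whose positions within the C1/C2 sequence can overlap, so a fuller treatment would extend each seed match to its maximal length rather than use fixed windows; the fixed-window form is kept here only to make the distance constraint easy to see.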
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